The Art Of Technology: DISTRIBUTED SYSTEMS

DISTRIBUTED SYSTEMS

Depending on how their components are distributed, computer-based systems can be divided into two kinds: standalone systems and distributed systems. Standalone systems are commonly known as desktop applications.

A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single integrated computing facility.

DISTRIBUTED COMPUTING: Distributed computing is where multiple computing units are connected to achieve a common task. The combined computing power allows far more work to be performed than on a single unit, and searches can be coordinated for efficiency. When a search succeeds, the unit that made the find is usually credited with it. Distributed computing projects include hunting for large prime numbers and analyzing DNA codes.
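The idea of a coordinated search can be sketched on a single machine. In the minimal example below, worker threads stand in for separate computing units, and the function names (is_prime, search_chunk, coordinated_search) are made up for illustration. A real project would hand out the chunks over a network rather than to threads.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; good enough for a small illustration."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def search_chunk(bounds: tuple[int, int]) -> list[int]:
    """Work done by one computing unit: scan its own sub-range [lo, hi)."""
    lo, hi = bounds
    return [n for n in range(lo, hi) if is_prime(n)]


def coordinated_search(lo: int, hi: int, units: int = 4) -> list[int]:
    """Split [lo, hi) into chunks, give one chunk to each worker, merge results."""
    step = max(1, (hi - lo + units - 1) // units)
    chunks = [(start, min(start + step, hi)) for start in range(lo, hi, step)]
    with ThreadPoolExecutor(max_workers=units) as pool:
        # map() returns results in the same order as the chunks.
        results = list(pool.map(search_chunk, chunks))
    return [p for chunk in results for p in chunk]


print(coordinated_search(100, 130))
```

Because each chunk covers a disjoint sub-range, no two workers repeat the same test, which is the "coordinated for efficiency" part.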

Distributed computing is a field of computer science that studies distributed systems. A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. The components interact with one another in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components. Examples of distributed systems vary from SOA-based systems to massively multiplayer online games to peer-to-peer applications.

STANDALONE SYSTEMS AND DISTRIBUTED SYSTEMS

Standalone: In IBM Streams, a Standalone application is a binary that can be launched directly; that is, it does not run on the Streams runtime environment. You can instruct the compiler to generate a Standalone application by providing the -T option at compile time. In Standalone mode, all operators are fused into a single partition, and a single processing element (PE) is generated for this partition. A Standalone executable is also generated with the name Standalone in the output/bin directory. When launched, this executable loads the aforementioned PE.

Distributed: A Distributed application is an application that can be submitted to the Streams runtime environment for execution. Unlike a Standalone application, the operators in a Distributed application can be fused to more than one PE, and the PE(s) can be distributed onto multiple hosts. To run this type of application on a Streams instance, you need to provide the compiler-generated ADL to the streamtools commands.

ELEMENTS OF DISTRIBUTED SYSTEMS

• Processing components

• Data networks

• Data Configurations

TYPES OF DISTRIBUTED SYSTEMS

• Mail service (SMTP, POP3, IMAP)

• File transfer and sharing (FTP)

• Remote login (Telnet)

• Games and multimedia (RTP, SIP, H.26x)

TYPES OF WEB BASED SYSTEMS

• Web sites

• Web applications

• Web services and client apps

• Rich Internet Applications (RIAs)/Rich Web-based Applications (RiWAs)

CLIENT/SERVER (CLIENT/SERVER MODEL, CLIENT/SERVER ARCHITECTURE)

Client/server is a program relationship in which one program (the client) requests a service or resource from another program (the server).

Although the client/server model can be used by programs within a single computer, it is a more important concept for networking. In this case, the client establishes a connection to the server over a local area network (LAN) or wide-area network (WAN), such as the Internet. Once the server has fulfilled the client's request, the connection is terminated. Your Web browser is a client program that has requested a service from a server; in fact, the service and resource the server provided is the delivery of this Web page.
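The connect, request, reply, disconnect cycle described above can be sketched with plain TCP sockets. This is only a minimal illustration that runs the server and the client in one process on the loopback address; the function names and the "index.html" request are assumptions, not part of any real protocol.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept one client, answer its request, close the connection."""
    conn, _addr = listener.accept()
    with conn:
        # A single recv() is enough here only because the message is tiny.
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"resource for {request}".encode("utf-8"))


def request_resource(host: str, port: int, name: str) -> str:
    """Client side: connect, send a request, read the reply, disconnect."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(name.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def client_server_demo() -> str:
    # Port 0 asks the operating system for any free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        reply = request_resource("127.0.0.1", port, "index.html")
        worker.join()
    return reply


print(client_server_demo())
```

A web browser and a web server follow the same pattern, with HTTP defining what the request and reply look like.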

Computer transactions in which the server fulfills a request made by a client are very common, and the client/server model has become one of the central ideas of network computing. Most business applications use the client/server model, as do many of the application protocols that run over the Internet's core protocol suite, TCP/IP. For example, when you check your bank account from your computer, a client program in your computer forwards a request to a server program at the bank. That program may in turn forward a request to its own client program, which then sends a request to a database server at another bank computer. Once your account balance has been retrieved from the database, it is returned to the bank data client, which in turn serves it back to the client in your personal computer, which then displays the information to you.

Both client programs and server programs are often part of a larger program or application. Because multiple client programs share the services of the same server program, a special server called a daemon may be activated just to await client requests. In marketing, the term client/server was once used to distinguish distributed computing by personal computers (PCs) from the monolithic, centralized computing model used by mainframes. This distinction has largely disappeared, however, as mainframes and their applications have also turned to the client/server model and become part of network computing.

Other program relationship models include master/slave and peer-to-peer (P2P). In the P2P model, each node in the network can function as both a server and a client. In the master/slave model, one device or process (known as the master) controls one or more other devices or processes (known as slaves). Once the master/slave relationship is established, the direction of control is always one way, from the master to the slave.

SERVICE ORIENTED ARCHITECTURE

Service-oriented architecture (SOA) is a style of software design where services are provided to the other components by application components, through a communication protocol over a network. The basic principles of service-oriented architecture are independent of vendors, products and technologies. A service is a discrete unit of functionality that can be accessed remotely and acted upon and updated independently, such as retrieving a credit card statement online.

According to one of many definitions of SOA, a service has four properties:

1. It logically represents a business activity with a specified outcome.

2. It is self-contained.

3. It is a black box for its consumers.

4. It may consist of other underlying services.
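The four properties can be illustrated with a small in-process sketch. All class and method names below are hypothetical, and plain Python objects stand in for services that would normally be reached over a network.

```python
from typing import Protocol


class StatementService(Protocol):
    """Contract seen by consumers; whatever sits behind it is a black box."""

    def get_statement(self, account_id: str) -> dict: ...


class TransactionService:
    """Underlying service (hypothetical): returns raw transactions."""

    def list_transactions(self, account_id: str) -> list[float]:
        return {"acct-1": [120.0, -45.5, 10.0]}.get(account_id, [])


class CreditCardStatementService:
    """A self-contained business activity with a specified outcome: a statement.
    It is built on another service that its consumers never see."""

    def __init__(self, transactions: TransactionService) -> None:
        self._transactions = transactions

    def get_statement(self, account_id: str) -> dict:
        items = self._transactions.list_transactions(account_id)
        return {"account": account_id, "items": items, "balance": sum(items)}


def read_balance(service: StatementService, account_id: str) -> float:
    """A consumer that depends only on the contract, not the implementation."""
    return service.get_statement(account_id)["balance"]


service = CreditCardStatementService(TransactionService())
print(read_balance(service, "acct-1"))
```

Since read_balance only knows the contract, the statement service could be reimplemented or redeployed without the consumer changing.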

SOA was first termed Service-Based Architecture in 1998 by a team developing integrated foundational management services and then business process-type services based upon units of work and using CORBA for inter-process communications.

Different services can be used in conjunction to provide the functionality of a large software application, a principle SOA shares with modular programming. Service-oriented architecture integrates distributed, separately-maintained and -deployed software components. It is enabled by technologies and standards that facilitate components' communication and cooperation over a network, especially over an IP network.

COMMUNICATION IN DISTRIBUTED SYSTEMS

Interprocess communication is at the heart of all distributed systems. It makes no sense to study distributed systems without carefully examining the ways that processes on different machines can exchange information. Communication in distributed systems is always based on low-level message passing as offered by the underlying network. Expressing communication through message passing is harder than using primitives based on shared memory, as available for non-distributed platforms. Modern distributed systems often consist of thousands or even millions of processes scattered across a network with unreliable communication such as the Internet. Unless the primitive communication facilities of computer networks are replaced by something else, development of large-scale distributed applications is extremely difficult.

In this chapter, we start by discussing the rules that communicating processes must adhere to, known as protocols, and concentrate on structuring those protocols in the form of layers. We then look at three widely-used models for communication: Remote Procedure Call (RPC), Message-Oriented Middleware (MOM), and data streaming. We also discuss the general problem of sending data to multiple receivers, called multicasting.

Our first model for communication in distributed systems is the remote procedure call (RPC). An RPC aims at hiding most of the intricacies of message passing, and is ideal for client-server applications.
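As a rough illustration of how an RPC hides message passing, the sketch below uses XML-RPC from Python's standard library. The add procedure and the rpc_demo helper are made up for this example, and the server runs in a background thread on the loopback address only to keep the demo self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a: int, b: int) -> int:
    """Procedure that lives on the server."""
    return a + b


def rpc_demo() -> int:
    # Port 0 lets the operating system choose a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The proxy turns the call below into a request message and
        # unpacks the reply; the caller sees an ordinary function call.
        with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()


print(rpc_demo())
```

The client never builds or parses a message itself, which is the point of the model.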

In many distributed applications, communication does not follow the rather strict pattern of client-server interaction. In those cases, it turns out that thinking in terms of messages is more appropriate. However, the low-level communication facilities of computer networks are in many ways not suitable due to their lack of distribution transparency. An alternative is to use a high-level message-queuing model, in which communication proceeds much the same as in electronic mail systems. Message-oriented middleware (MOM) is a subject important enough to warrant a section of its own.
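The message-queuing idea can be sketched with an in-process queue. This is only an analogy: queue.Queue stands in for a real message broker, and the consumer and mq_demo names are assumptions made for the example.

```python
import queue
import threading


def consumer(mailbox: queue.Queue, received: list) -> None:
    """Take messages from the queue until the sentinel None arrives."""
    while True:
        message = mailbox.get()
        if message is None:
            break
        received.append(message.upper())


def mq_demo() -> list:
    mailbox: queue.Queue = queue.Queue()
    received: list = []
    # The sender enqueues messages before the receiver is even running,
    # much like mail waiting in an inbox.
    for text in ("order placed", "order shipped"):
        mailbox.put(text)
    mailbox.put(None)
    worker = threading.Thread(target=consumer, args=(mailbox, received))
    worker.start()
    worker.join()
    return received


print(mq_demo())
```

Unlike the client/server pattern, the sender here does not wait for the receiver, and neither side needs to know when the other is active.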

With the advent of multimedia distributed systems, it became apparent that many systems were lacking support for communication of continuous media, such as audio and video. What is needed is the notion of a stream that can support the continuous flow of messages, subject to various timing constraints. Streams are discussed in a separate section.

Finally, since our understanding of setting up multicast facilities has improved, novel and elegant solutions for data dissemination have emerged. We pay separate attention to this subject in the last section of this chapter.

Data formatting/structuring

• Plain text

• Files (text, image)

• Query string

• XML
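Of the formats listed above, the query string is the quickest to try out. The short sketch below uses Python's standard urllib.parse module; the parameter names are invented for the example.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical parameters a client might send to a server.
params = {"user": "alice", "page": 2, "q": "distributed systems"}

# Encode the structure as key=value pairs joined by "&".
encoded = urlencode(params)

# Decode it again; every value comes back as a list of strings.
decoded = parse_qs(encoded)

print(encoded)
print(decoded)
```

Note that the number 2 comes back as the string "2": a query string carries only text, so the receiver has to know which types to expect.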

XML

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The W3C's XML 1.0 Specification and several other related specifications, all of them free open standards, define XML.

The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services.
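To make the point about arbitrary data structures concrete, here is a minimal sketch that reads a small XML document into a Python dictionary with the standard xml.etree.ElementTree module. The order document and the parse_order function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up document; the customer name includes a non-ASCII character.
ORDER_XML = """
<order id="42">
  <customer>Zoë</customer>
  <item sku="A1" quantity="2">Keyboard</item>
  <item sku="B7" quantity="1">Mouse</item>
</order>
"""


def parse_order(text: str) -> dict:
    """Convert the XML text into nested Python data."""
    root = ET.fromstring(text.strip())
    return {
        "id": int(root.get("id")),
        "customer": root.findtext("customer"),
        "items": [
            {
                "sku": item.get("sku"),
                "quantity": int(item.get("quantity")),
                "name": item.text,
            }
            for item in root.findall("item")
        ],
    }


print(parse_order(ORDER_XML))
```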

Several schema systems exist to aid in the definition of XML-based languages, while programmers have developed many application programming interfaces (APIs) to aid the processing of XML data.

APPLICATIONS OF XML

The rationale for extensible markup languages is covered in general treatments of markup languages and of the Standard Generalized Markup Language (SGML), from which XML derives.

Hundreds of document formats using XML syntax have been developed, including RSS, Atom, SOAP, SVG, and XHTML. XML-based formats have become the default for many office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple's iWork. XML has also provided the base language for communication protocols such as XMPP. Applications for the Microsoft .NET Framework use XML files for configuration, and property lists are an implementation of configuration storage built on XML.

Many industry data standards, such as HL7, OTA, FpML, MISMO, and NIEM, are based on XML and the rich features of the XML schema specification. Many of these standards are quite complex, and it is not uncommon for a specification to comprise several thousand pages.

In publishing, DITA is an XML industry data standard. XML is used extensively to underpin various publishing formats.

XML is widely used in service-oriented architecture (SOA). Disparate systems communicate with each other by exchanging XML messages. The message exchange format is standardised as an XML schema (XSD). This is also referred to as the canonical schema.

XML has come into common use for the interchange of data over the Internet. IETF RFC 3023, now superseded by RFC 7303, gave rules for the construction of Internet Media Types for use when sending XML. RFC 7303 defines the media types application/xml and text/xml, which say only that the data is in XML, and nothing about its semantics.

RFC 7303 also recommends that XML-based languages be given media types ending in +xml; for example image/svg+xml for SVG.
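A receiver can use this naming convention to decide whether a payload should go to an XML parser. The function below is a simplified sketch, not a complete media type parser, and its name is an assumption.

```python
def is_xml_media_type(media_type: str) -> bool:
    """Return True for application/xml, text/xml, or any type ending in +xml."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    essence = media_type.split(";", 1)[0].strip().lower()
    return essence in {"application/xml", "text/xml"} or essence.endswith("+xml")


for value in ("image/svg+xml", "application/xml; charset=utf-8", "application/json"):
    print(value, is_xml_media_type(value))
```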

Further guidelines for the use of XML in a networked context appear in RFC 3470, also known as IETF BCP 70, a document covering many aspects of designing and deploying an XML-based language.

KEY TERMINOLOGY

The material in this section is based on the XML Specification. This is not an exhaustive list of all the constructs that appear in XML; it provides an introduction to the key constructs most often encountered in day-to-day use.

Character

An XML document is a string of characters. Almost every legal Unicode character may appear in an XML document.

Processor and application

The processor analyses the markup and passes structured information to an application. The specification places requirements on what an XML processor must do and not do, but the application is outside its scope. The processor (as the specification calls it) is often referred to colloquially as an XML parser.

Markup and content

The characters making up an XML document are divided into markup and content, which may be distinguished by the application of simple syntactic rules. Generally, strings that constitute markup either begin with the character < and end with a >, or they begin with the character & and end with a ;. Strings of characters that are not markup are content. However, in a CDATA section, the delimiters <![CDATA[ and ]]> are classified as markup, while the text between them is classified as content. In addition, whitespace before and after the outermost element is classified as markup.
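Because < and & introduce markup, content that contains them has to be written either with entity references or inside a CDATA section. The sketch below shows both routes with Python's standard library; the sample text and the code element name are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Content that would be mistaken for markup if written as-is.
raw = "if a < b && b > c"

# Route 1: replace the special characters with entity references.
escaped = escape(raw)
via_entities = ET.fromstring(f"<code>{escaped}</code>").text

# Route 2: wrap the text in a CDATA section and leave it untouched.
via_cdata = ET.fromstring(f"<code><![CDATA[{raw}]]></code>").text

print(escaped)
print(via_entities == raw, via_cdata == raw)
```

Either way, the parser hands the application the same content string; the entity references and CDATA delimiters are markup and do not reach it.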

Tag

A tag is a markup construct that begins with < and ends with >. Tags come in three flavors:

· start-tag, such as <section>;

· end-tag, such as </section>;

· empty-element tag, such as <line-break />.

Element

An element is a logical document component that either begins with a start-tag and ends with a matching end-tag or consists only of an empty-element tag. The characters between the start-tag and end-tag, if any, are the element's content, and may contain markup, including other elements, which are called child elements. An example is <greeting>Hello, world!</greeting>. Another is <line-break />.
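The same elements can be built programmatically. The following sketch uses xml.etree.ElementTree to create a parent element with two children, one with content and one empty, and then serialises it.

```python
import xml.etree.ElementTree as ET

# Parent element with two child elements.
section = ET.Element("section")

# A child with a start-tag, content, and an end-tag.
greeting = ET.SubElement(section, "greeting")
greeting.text = "Hello, world!"

# A child with no content, written as an empty-element tag.
ET.SubElement(section, "line-break")

serialized = ET.tostring(section, encoding="unicode")
print(serialized)
```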

Attribute

An attribute is a markup construct consisting of a name–value pair that exists within a start-tag or empty-element tag. An example is <img src="madonna.jpg" alt="Madonna" />, where the names of the attributes are "src" and "alt", and their values are "madonna.jpg" and "Madonna" respectively. Another example is <step number="3">Connect A to B.</step>, where the name of the attribute is "number" and its value is "3". An XML attribute can only have a single value, and each attribute can appear at most once on each element. In the common situation where a list of multiple values is desired, the list must be encoded into a single well-formed XML attribute value using some format beyond what XML itself defines. Usually this is either a comma- or semicolon-delimited list or, if the individual values are known not to contain spaces, a space-delimited list. An example is <div class="inner greeting-box">Welcome!</div>, where the attribute "class" has the single value "inner greeting-box", which by convention indicates the two CSS class names "inner" and "greeting-box".
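Both rules can be checked with a parser. In the sketch below, the attribute value arrives as one string and is split into a list by the application, and an element that repeats an attribute is rejected. The is_well_formed helper is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

div = ET.fromstring('<div class="inner greeting-box">Welcome!</div>')

# To XML this is one value; splitting it is an application-level convention.
class_value = div.get("class")
class_names = class_value.split()


def is_well_formed(text: str) -> bool:
    """Return False if the parser rejects the document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(class_value, class_names)
print(is_well_formed('<step number="3" number="4">Connect A to B.</step>'))
```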

COMPARISON BETWEEN JSON AND XML

JSON

Pro:

· Simple syntax, which results in less "markup" overhead compared to XML.

· Easy to use with JavaScript as the markup is a subset of JS object literal notation and has the same basic data types as JavaScript.

· JSON Schema for description and datatype and structure validation

· JsonPath for extracting information in deeply nested structures

Con:

· Simple syntax; only a handful of different data types are supported.

· No support for comments.

XML

Pro:

· Generalized markup; it is possible to create "dialects" for any kind of purpose

· XML Schema for datatype and structure validation; it also makes it possible to define new datatypes

· XSLT for transformation into different output formats

· XPath/XQuery for extracting information in deeply nested structures

· Built-in support for namespaces

Con:

· Relatively wordy compared to JSON (results in more data for the same amount of information).
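The difference in overhead is easy to see by serialising one record both ways. This is a small sketch with an invented record; the exact sizes depend on the element names and formatting chosen.

```python
import json
import xml.etree.ElementTree as ET

# A made-up record to serialise in both formats.
record = {"id": 7, "name": "Asha", "active": True}

# JSON keeps the basic data types (number, string, boolean).
as_json = json.dumps(record)

# In XML every field becomes an element whose content is text.
root = ET.Element("user")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), as_json)
print(len(as_xml), as_xml)
```

Reading the JSON back returns the number 7 and the boolean True, while the XML version returns the strings "7" and "True" unless a schema or the application supplies the types.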

The Art Of Technology

Friday, 8 March 2019
