DISTRIBUTED SYSTEMS
Based on how their components are distributed, computer-based systems can be
divided into two kinds: standalone systems and distributed systems. Standalone
systems are commonly known as desktop applications.
A distributed system consists of a collection of autonomous
computers, connected through a network and distribution middleware, which
enables computers to coordinate their activities and to share the resources of
the system, so that users perceive the system as a single integrated computing
facility.
DISTRIBUTED COMPUTING: Distributed computing is where multiple
computing units are connected to achieve a common task. The combined computing
power allows far more work to be performed than on a single unit, and
searches can be coordinated for efficiency. A successful search usually earns
the finder credit. Distributed computing projects include hunting for large
prime numbers and analyzing DNA sequences.
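As a rough sketch of how such a search might be coordinated, the Python example below splits a range of numbers into chunks and gives each chunk to a worker. The workers here are threads in one process standing in for separate machines, and the names `is_prime`, `search_chunk` and `find_primes_distributed`, along with the chunking scheme, are assumptions made for illustration rather than the design of any real project.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; adequate for a small demonstration."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def search_chunk(bounds):
    """One worker's task: report the primes in [start, stop)."""
    start, stop = bounds
    return [n for n in range(start, stop) if is_prime(n)]


def find_primes_distributed(start, stop, workers=4):
    """Split [start, stop) into chunks and search them concurrently."""
    size = max(1, (stop - start + workers - 1) // workers)
    chunks = [(lo, min(lo + size, stop)) for lo in range(start, stop, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the chunks
        results = list(pool.map(search_chunk, chunks))
    return [p for chunk in results for p in chunk]
```

Calling `find_primes_distributed(1, 30)` searches four chunks and merges the results. In a real deployment each chunk would be sent over the network to a different computer, which is where the coordination cost comes from.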
Distributed computing is a field of computer science that studies distributed
systems. A distributed system is a system whose components are located on
different networked computers, which communicate and coordinate their actions
by passing messages to one another. The components interact with one another in
order to achieve a common goal. Three significant characteristics of
distributed systems are: concurrency of components, lack of a global clock, and
independent failure of components. Examples of distributed systems vary from
SOA-based systems to massively multiplayer online games to peer-to-peer
applications.
STANDALONE SYSTEMS AND DISTRIBUTED SYSTEMS
Standalone: In the IBM Streams platform, a Standalone application is a binary
that can be launched directly. That is, it does not run on the Streams runtime
environment. You can instruct the compiler to generate a Standalone application
by providing the -T option at compile time. In Standalone mode, all operators
are fused into a single partition, and a single processing element (PE) is
generated for this partition. A Standalone executable is also generated with
the name Standalone in the output/bin directory. When launched, this executable
loads the aforementioned PE.
Distributed: A Distributed application is an application that can be submitted
to the Streams runtime environment for execution. Unlike a Standalone
application, the operators in a Distributed application can be fused into more
than one PE, and the PE(s) can be distributed onto multiple hosts. To run this
type of application on a Streams instance, you need to provide the
compiler-generated ADL to the streamtools commands.
ELEMENTS OF DISTRIBUTED SYSTEMS
• Processing components
• Data networks
• Data
• Configurations
TYPES OF DISTRIBUTED SYSTEMS
• Mail service (SMTP, POP3, IMAP)
• File transferring and sharing (FTP)
• Remote login (Telnet)
• Games and multimedia (RTP, SIP, H.26x)
TYPES OF WEB-BASED SYSTEMS
• Web sites
• Web applications
• Web services and client apps
• Rich Internet Applications (RIAs)/Rich Web-based Applications (RiWAs)
Client/server (client/server model, client/server architecture)
Client/server is a program relationship in which one program (the client)
requests a service or resource from another program (the server).
Although the client/server model can be used by programs within a single
computer, it is a more important concept for networking. In this case, the
client establishes a connection to the server over a local area network (LAN)
or wide-area network (WAN), such as the Internet. Once the server has fulfilled
the client's request, the connection is terminated. Your Web browser is a
client program that has requested a service from a server; in fact, the service
and resource the server provided is the delivery of this Web page.
Computer transactions in which the server fulfills a request made by a client
are very common, and the client/server model has become one of the central
ideas of network computing. Most business applications use the client/server
model, as do the application protocols that run over the Internet's core
protocol suite, TCP/IP. For example, when you check your bank account from your
computer, a client program in your computer forwards a request to a server
program at the bank. That program may in turn forward a request to its own
client program, which then sends a request to a database server at another bank
computer. Once your account balance has been retrieved from the database, it is
returned to the bank data client, which in turn serves it back to the client in
your personal computer, which then displays the information to you.
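The request/response exchange described above can be sketched with Python's standard `socket` module. This is a minimal illustration under stated assumptions, not how a real bank works: the `BALANCES` table, the function names, and the one-line text protocol (the client sends an account name, the server replies with a balance) are all invented for the example, and both ends run on the local machine.

```python
import socket
import threading

# Hypothetical account data held by the "bank" server.
BALANCES = {"alice": "120.50", "bob": "75.00"}


def serve_once(server_sock):
    """Accept one client, answer its balance request, then close the connection."""
    conn, _addr = server_sock.accept()
    with conn:
        account = conn.recv(1024).decode("utf-8").strip()
        reply = BALANCES.get(account, "unknown account")
        conn.sendall(reply.encode("utf-8"))


def request_balance(host, port, account):
    """Client side: connect, send the account name, return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(account.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo(account="alice"):
    """Run one request/response exchange on the local machine."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        return request_balance("127.0.0.1", port, account)
    finally:
        worker.join()
        server_sock.close()
```

As in the description, the connection is closed once the single request has been fulfilled. A production server would loop on `accept()` and handle many clients concurrently.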
Both client programs and server programs are often part of a larger program or
application. Because multiple client programs share the services of the same
server program, a special server called a daemon may be activated just to await
client requests. In marketing, the term client/server was once used to
distinguish distributed computing by personal computers (PCs) from the
monolithic, centralized computing model used by mainframes. This distinction
has largely disappeared, however, as mainframes and their applications have
also turned to the client/server model and become part of network computing.
Other program relationship models include master/slave and peer-to-peer (P2P).
In the P2P model, each node in the network can function as both a server and a
client. In the master/slave model, one device or process (known as the master)
controls one or more other devices or processes (known as slaves). Once the
master/slave relationship is established, the direction of control is always
one way, from the master to the slave.
SERVICE ORIENTED ARCHITECTURE
Service-oriented architecture (SOA) is a style of software design where
services are provided to the other components by application components, through a communication protocol over a
network. The basic principles of service-oriented architecture are independent
of vendors, products and technologies. A service is a discrete unit of
functionality that can be accessed remotely and acted upon and updated
independently, such as retrieving a credit card statement online.
A service has four properties according to one of many definitions of SOA:
1. It logically represents a business activity with a specified outcome.
2. It is self-contained.
3. It is a black box for its consumers.
4. It may consist of other underlying services.
SOA was first termed Service-Based Architecture in 1998 by a team developing
integrated foundational management services and then business process-type
services based upon units of work and using CORBA for inter-process
communications.
Different services can be used in conjunction to provide the functionality of a
large software application, a principle SOA shares with modular programming.
Service-oriented architecture integrates distributed, separately maintained and
separately deployed software components. It is enabled by technologies and
standards that facilitate components' communication and cooperation over a
network, especially over an IP network.
COMMUNICATION IN DISTRIBUTED SYSTEMS
Interprocess communication is at the heart of all
distributed systems. It makes no sense to study distributed systems without
carefully examining the ways that processes on different machines can exchange
information. Communication in distributed systems is always based on low-level
message passing as offered by the underlying network. Expressing communication
through message passing is harder than using primitives based on shared memory,
as available for non-distributed platforms. Modern distributed systems often
consist of thousands or even millions of processes scattered across a network
with unreliable communication such as the Internet. Unless the primitive
communication facilities of computer networks are replaced by something else,
development of large-scale distributed applications is extremely difficult.
In this chapter, we start by discussing the rules
that communicating processes must adhere to, known as protocols, and
concentrate on structuring those protocols in the form of layers. We then look
at three widely-used models for communication: Remote Procedure Call (RPC),
Message-Oriented Middleware (MOM), and data streaming. We also discuss the
general problem of sending data to multiple receivers, called multicasting.
Our first model for communication in distributed
systems is the remote procedure call (RPC). An RPC aims at hiding most of the
intricacies of message passing, and is ideal for client-server applications.
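To make the idea concrete, here is a small sketch using XML-RPC from the Python standard library (`xmlrpc.server` and `xmlrpc.client`). XML-RPC is only one of many RPC mechanisms and is chosen here because it needs no extra installation; the `add` procedure and the helper names `start_server` and `remote_add` are assumptions for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure that will be executed on the server."""
    return a + b


def start_server():
    """Start an XML-RPC server on a free local port; return (server, port)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def remote_add(a, b):
    """Call add() through the proxy as if it were a local function."""
    server, port = start_server()
    try:
        with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            # Looks like a local call; the proxy builds the request message,
            # sends it over HTTP and unpacks the reply.
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()
```

The caller of `proxy.add(2, 3)` never sees the messages being exchanged, which is the kind of hiding that RPC aims for.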
In many distributed applications, communication
does not follow the rather strict pattern of client-server interaction. In
those cases, it turns out that thinking in terms of messages is more
appropriate. However, the low-level communication facilities of computer
networks are in many ways not suitable due to their lack of distribution
transparency. An alternative is to use a high-level message-queuing model, in
which communication proceeds much the same as in electronic mail systems.
Message-oriented middleware (MOM) is a subject important enough to warrant a
section of its own.
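The message-queuing idea can be sketched in Python with an in-process `queue.Queue` standing in for a real message broker. This is an assumption-laden illustration: the `SENTINEL` end marker and the `producer`, `consumer` and `run_exchange` names are invented, and actual message-oriented middleware would store the queue on separate machines and persist it.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker


def producer(mailbox, messages):
    """Put messages into the queue without waiting for the receiver."""
    for msg in messages:
        mailbox.put(msg)
    mailbox.put(SENTINEL)


def consumer(mailbox, received):
    """Take messages out of the queue whenever the receiver is ready."""
    while True:
        msg = mailbox.get()
        if msg is SENTINEL:
            break
        received.append(msg.upper())


def run_exchange(messages):
    """Send all messages first, then start the receiver and collect results."""
    mailbox = queue.Queue()
    received = []
    sender = threading.Thread(target=producer, args=(mailbox, messages))
    receiver = threading.Thread(target=consumer, args=(mailbox, received))
    sender.start()
    sender.join()  # the sender finishes before the receiver even starts
    receiver.start()
    receiver.join()
    return received
```

As with electronic mail, the sender and receiver do not need to be active at the same time: the queue holds the messages until the receiver picks them up.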
With the advent of multimedia distributed systems,
it became apparent that many systems were lacking support for communication of
continuous media, such as audio and video. What is needed is the notion of a
stream that can support the continuous flow of messages, subject to various
timing constraints. Streams are discussed in a separate section.
Finally, since our understanding of setting up
multicast facilities has improved, novel and elegant solutions for data
dissemination have emerged. We pay separate attention to this subject in the
last section of this chapter.
Data formatting/structuring
• Plain text
• Files (text, image)
• Query string
• XML
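As an illustration of the query string format from the list above, the following Python sketch encodes a small record and decodes it again with the standard `urllib.parse` module. The `record` contents and the two helper function names are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical record to be sent from a client to a server.
record = {"user": "alice", "item": "blue pen", "qty": 2}


def to_query_string(data):
    """Encode a flat dict as a URL query string (values become text)."""
    return urlencode(data)


def from_query_string(text):
    """Decode a query string; each key maps to a list of string values."""
    return parse_qs(text)
```

Here `to_query_string(record)` produces `user=alice&item=blue+pen&qty=2`. Decoding returns every value as a string inside a list, so the number 2 comes back as `"2"`: a query string carries no type information and no nesting.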
XML
Extensible Markup Language (XML) is a markup language that defines a set of
rules for encoding documents in a format that is both human-readable and
machine-readable. The W3C's XML 1.0 Specification and several other related
specifications, all of them free open standards, define XML.
The design goals of XML emphasize simplicity, generality, and usability across
the Internet. It is a textual data format with strong support via Unicode for
different human languages. Although the design of XML focuses on documents, the
language is widely used for the representation of arbitrary data structures
such as those used in web services.
Several schema systems exist to aid in the definition of XML-based languages,
while programmers have developed many application programming interfaces (APIs)
to aid the processing of XML data.
APPLICATIONS OF XML
The essence of why extensible markup languages are necessary is explained under
Markup language (for example, see Markup language § XML) and under Standard
Generalized Markup Language.
Hundreds of document formats using XML syntax have been developed, including
RSS, Atom, SOAP, SVG, and XHTML. XML-based formats have become the default for
many office-productivity tools, including Microsoft Office (Office Open XML),
OpenOffice.org and LibreOffice (OpenDocument), and Apple's iWork. XML has also
provided the base language for communication protocols such as XMPP.
Applications for the Microsoft .NET Framework use XML files for configuration,
and property lists are an implementation of configuration storage built on XML.
Many industry data standards, such as HL7, OTA, FpML, MISMO, and NIEM, are
based on XML and the rich features of the XML schema specification. Many of
these standards are quite complex, and it is not uncommon for a specification to
comprise several thousand pages.
In publishing, DITA is an XML industry data standard. XML is used extensively
to underpin various publishing formats.
XML is widely used in service-oriented architecture (SOA). Disparate systems
communicate with each other by exchanging XML messages. The message exchange
format is standardised as an XML schema (XSD). This is also referred to as the
canonical schema.
XML has come into common use for the interchange of data over the Internet.
IETF RFC 3023, now superseded by RFC 7303, gave rules for the construction of
Internet Media Types for use when sending XML. RFC 7303 defines the media types
application/xml and text/xml, which say only that the data is in XML, and
nothing about its semantics. It also recommends that XML-based languages be
given media types ending in +xml; for example image/svg+xml for SVG.
Further guidelines for the use of XML in a networked context appear in RFC 3470,
also known as IETF BCP 70, a document covering many aspects of designing and
deploying an XML-based language.
KEY TERMINOLOGY
The material in this section is based on the XML Specification. This is not an
exhaustive list of all the constructs that appear in XML; it provides an
introduction to the key constructs most often encountered in day-to-day use.
Character
An XML document is a string of characters. Almost every legal Unicode character
may appear in an XML document.
Processor and application
The processor analyses the markup and passes structured information to an
application. The specification places requirements on what an XML processor
must do and not do, but the application is outside its scope. The processor (as
the specification calls it) is often referred to colloquially as an XML parser.
Markup and content
The characters making up an XML document are divided into markup and content,
which may be distinguished by the application of simple syntactic rules.
Generally, strings that constitute markup either begin with the character < and
end with a >, or they begin with the character & and end with a ;. Strings of
characters that are not markup are content. However, in a CDATA section, the
delimiters <![CDATA[ and ]]> are classified as markup, while the text between
them is classified as content. In addition, whitespace before and after the
outermost element is classified as markup.
Tag
A tag is a markup construct that begins with < and ends with >. Tags come in
three flavors:
· start-tag, such as <section>;
· end-tag, such as </section>;
· empty-element tag, such as <line-break />.
Element
An element is a logical document component that either begins with a start-tag
and ends with a matching end-tag or consists only of an empty-element tag. The
characters between the start-tag and end-tag, if any, are the element's content,
and may contain markup, including other elements, which are called child
elements. An example is <greeting>Hello, world!</greeting>. Another is
<line-break />.
Attribute
An attribute is a markup construct consisting of a name–value pair that exists
within a start-tag or empty-element tag. An example is
<img src="madonna.jpg" alt="Madonna" />, where the names of the attributes are
"src" and "alt", and their values are "madonna.jpg" and "Madonna" respectively.
Another example is <step number="3">Connect A to B.</step>, where the name of
the attribute is "number" and its value is "3". An XML attribute can only have
a single value, and each attribute can appear at most once on each element. In
the common situation where a list of multiple values is desired, this must be
done by encoding the list into a well-formed XML attribute with some format
beyond what XML defines itself. Usually this is either a comma- or
semicolon-delimited list or, if the individual values are known not to contain
spaces, a space-delimited list. An example is
<div class="inner greeting-box">Welcome!</div>, where the attribute "class" has
the value "inner greeting-box" and also indicates the two CSS class names
"inner" and "greeting-box".
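The terms above can be seen in action with a short Python sketch that hands a document to an XML processor (here the standard library's `xml.etree.ElementTree`) and lists each element's tag, attributes and content. The `<manual>` root element wrapping the examples and the `describe` function are assumptions added so that the snippet is a single well-formed document.

```python
import xml.etree.ElementTree as ET

# The example elements from the text, wrapped in a hypothetical root element.
DOCUMENT = """<manual>
  <greeting>Hello, world!</greeting>
  <step number="3">Connect A to B.</step>
  <line-break />
</manual>"""


def describe(xml_text):
    """Return (tag, attributes, text) for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, dict(child.attrib), child.text) for child in root]
```

For the document above, `describe(DOCUMENT)` reports that `step` carries the attribute `number` with the string value `"3"`, and that the empty-element tag `<line-break />` has no content (its text is `None`).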
COMPARISON BETWEEN JSON AND XML
JSON
Pro:
· Simple syntax, which results in less "markup" overhead compared to XML.
· Easy to use with JavaScript, as the markup is a subset of JS object literal
notation and has the same basic data types as JavaScript.
· JSON Schema for description and for datatype and structure validation.
· JsonPath for extracting information in deeply nested structures.
Con:
· Simple syntax: only a handful of different data types are supported.
· No support for comments.
XML
Pro:
· Generalized markup; it is possible to create "dialects" for any kind of
purpose.
· XML Schema for datatype and structure validation. It also makes it possible
to define new datatypes.
· XSLT for transformation into different output formats.
· XPath/XQuery for extracting information in deeply nested structures.
· Built-in support for namespaces.
Con:
· Relatively wordy compared to JSON (results in more data for the same amount
of information).
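
To give a feel for the difference in verbosity, the Python sketch below serializes one record both ways using only the standard library. The `person` record, the element names, and the `to_json` and `to_xml` helpers are assumptions for illustration; other XML layouts (for example, using attributes) would give different sizes.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used only for this comparison.
person = {"name": "Ada", "age": 36, "languages": ["en", "fr"]}


def to_json(data):
    """Serialize the record as JSON text."""
    return json.dumps(data)


def to_xml(data):
    """Serialize the same record as XML, one element per field."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = data["name"]
    ET.SubElement(root, "age").text = str(data["age"])
    langs = ET.SubElement(root, "languages")
    for code in data["languages"]:
        ET.SubElement(langs, "language").text = code
    return ET.tostring(root, encoding="unicode")
```

With this layout the XML text is longer than the JSON text for the same information, because every value is wrapped in a start-tag and an end-tag. Note also that JSON keeps `36` as a number when parsed back, whereas plain XML returns it as the string `"36"` unless a schema says otherwise.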
