Management of Distributed Data
Tomasz Müldner, Elhadi Shakshuki*, Zhonghai Luo and Michael Powell
Jodrey School of Computer Science, Acadia University, Wolfville, NS, Canada
* presenting
Contents of the Talk
• General goals for systems that implement data distribution and remote program invocation
• Distributed Data Manager, DDM
• Conclusions
• Future Work
General Goals
• Platform Independence
• Security
• Scalability
• Pull and Push
• Efficiency
• Synchronization
• Fault Tolerance
• Flexibility
• Selective Choice
• Remote Invocation
DDM: Distributed Data Manager
• Two kinds of applications:
– a subscriber expresses interest in some data that can be provided by a publisher
– a publisher selects subscribers and specifies the data to be sent to these subscribers
• A single application may be both a publisher and a subscriber at the same time
• SSL is used to support secure transmission
• DDM extends our previous system, called DI
DDM: Channel
• An abstract concept that represents a virtual, unidirectional connection between two nodes (similar to a pipe used in P2P systems)
• Two kinds of channels:
– P-channel, created by a publisher
– S-channel, created by a subscriber
• Each channel has
– a name
– an optional description
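A minimal sketch of how a channel could be modelled, based only on the properties listed above (kind, name, optional description). The class and field names are hypothetical and are not taken from DDM itself.

```java
import java.util.Objects;

/** Hypothetical model of a DDM channel; names are illustrative, not DDM's API. */
final class Channel {
    enum Kind { P_CHANNEL, S_CHANNEL }

    final Kind kind;
    final String name;
    final String description; // optional: stored as "" when absent

    Channel(Kind kind, String name, String description) {
        this.kind = Objects.requireNonNull(kind, "kind");
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("channel name is required");
        }
        this.name = name;
        this.description = description == null ? "" : description;
    }

    boolean hasDescription() {
        return !description.isEmpty();
    }

    @Override
    public String toString() {
        return kind + ":" + name;
    }

    public static void main(String[] args) {
        Channel quote = new Channel(Kind.P_CHANNEL, "quote", null);
        System.out.println(quote + " hasDescription=" + quote.hasDescription());
    }
}
```

The name is mandatory and the description defaults to an empty string, mirroring the "name" and "optional description" bullets.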
DDM: P-channel
Each P-channel specifies the source of its data, i.e. the directory in the file system on the publisher’s computer.
DDM: P-channel
• The publisher can push, or distribute, some or all of its P-channels to one or more selected subscribers
• When the subscriber receives an incoming P-channel, it automatically creates the corresponding S-channel based on
– the publisher's IP address and port number
– the name of the P-channel
• Any data in the P-channel is pushed into the newly created S-channel
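The automatic creation of an S-channel can be sketched as a lookup keyed by the three values named above: publisher IP, port number, and P-channel name. The registry class, the key format, and the sample address are assumptions made for illustration; the slides do not describe DDM's internal data structures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical subscriber-side registry of S-channels. */
final class SChannelRegistrySketch {
    // Maps "ip:port/name" to the channel name; the key format is an assumption.
    private final Map<String, String> channels = new LinkedHashMap<>();

    static String key(String publisherIp, int publisherPort, String pChannelName) {
        return publisherIp + ":" + publisherPort + "/" + pChannelName;
    }

    /** Called when a P-channel arrives; returns true if a new S-channel was created. */
    boolean onIncomingPChannel(String publisherIp, int publisherPort, String pChannelName) {
        String k = key(publisherIp, publisherPort, pChannelName);
        if (channels.containsKey(k)) {
            return false; // the S-channel already exists, only its data is refreshed
        }
        channels.put(k, pChannelName);
        return true;
    }

    int size() {
        return channels.size();
    }

    public static void main(String[] args) {
        SChannelRegistrySketch registry = new SChannelRegistrySketch();
        System.out.println(registry.onIncomingPChannel("192.0.2.10", 9000, "quote"));
        System.out.println(registry.onIncomingPChannel("192.0.2.10", 9000, "quote"));
    }
}
```

Keying on all three values lets two publishers offer channels with the same name without colliding on the subscriber.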
Publisher’s Configuration Files
• publisher.properties (static)
– the port number on which the publisher will send data
– a root directory for storing all channels’ descriptions
– the filename and password for the keystore used to establish a secure SSL communication path
• target.properties (dynamic)
– lists all potential subscribers; can be changed at runtime
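As an illustration, a file such as publisher.properties can be read with the standard java.util.Properties class. The key names below (port, channel.root, keystore.file, keystore.password) and all values are hypothetical; the slides list what the file contains but not how the entries are spelled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: reading publisher settings; every key name here is an assumption. */
final class PublisherConfigSketch {
    static final String SAMPLE =
            "port=9000\n"
          + "channel.root=channels\n"
          + "keystore.file=publisher.jks\n"
          + "keystore.password=changeit\n";

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    static int port(Properties props) {
        return Integer.parseInt(props.getProperty("port").trim());
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println("port=" + port(props)
                + " root=" + props.getProperty("channel.root"));
    }
}
```

In a real deployment the text would come from the file on disk rather than from an embedded string.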
Publisher’s Dynamic Files
At runtime, end users can select one or more target clients on which to start the remote program at the same time.
Publisher’s Dynamic Files
If a channel contains multiple applications, the user can select the one to start from a list.
Subscriber’s Configuration Files
• subscriber.properties
– the port number for receiving data
– a root directory for storing all incoming channels’ data
– a channel creation type
– the filename and password for the keystore used to establish a secure SSL communication path
– a security flag indicating whether or not the data transmission is secure (true by default)
– a channel update type
– a channel update option
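The "true by default" behaviour of the security flag can be sketched with the two-argument form of Properties.getProperty, which returns a fallback when the key is absent. The key name secure is hypothetical; the slides do not give the actual entry name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: a security flag that defaults to true; the key "secure" is an assumption. */
final class SubscriberConfigSketch {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns true unless the configuration explicitly turns security off. */
    static boolean isSecure(String configText) {
        String value = parse(configText).getProperty("secure", "true");
        return Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        System.out.println(isSecure("port=9100\n"));
        System.out.println(isSecure("port=9100\nsecure=false\n"));
    }
}
```

Defaulting to the secure setting means an incomplete configuration file never silently disables SSL.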
Relative Virtual Mapping
Example:
1. A publisher P creates a channel "quote" and specifies the corresponding directory "c:\program\quote".
2. The publisher distributes the "quote" channel to a subscriber S with all channel data (this includes all files and subdirectories under "c:\program\quote").
3. In S, a channel named "quote" is created automatically, and all received data (files and subdirectories) are stored with the original hierarchy under the root channel directory (specified in the configuration file subscriber.properties).
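The mapping in this example can be sketched with java.nio.file.Path: take the part of each file path that lies below the publisher's channel directory and re-attach it under the subscriber's root directory and channel name. The method name, the sample paths, and the check that rejects files outside the channel directory are illustrative assumptions, not a description of DDM's code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch: preserve the directory hierarchy when storing a received file. */
final class RelativeMappingSketch {
    static String map(String publisherChannelDir, String publisherFile,
                      String subscriberRoot, String channelName) {
        Path base = Paths.get(publisherChannelDir).normalize();
        Path file = Paths.get(publisherFile).normalize();
        if (!file.startsWith(base)) {
            throw new IllegalArgumentException(
                    "file is outside the channel directory: " + publisherFile);
        }
        // Path of the file relative to the channel directory, e.g. data/today.txt
        Path relative = base.relativize(file);
        Path target = Paths.get(subscriberRoot).resolve(channelName).resolve(relative);
        // Use forward slashes so the result prints the same on every platform.
        return target.toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(map("program/quote", "program/quote/data/today.txt",
                               "incoming", "quote"));
    }
}
```

Because only the relative part of the path travels with the data, the subscriber's root directory can differ from the publisher's without changing the hierarchy inside the channel.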
S-channel
• The S-channel is a reference to the corresponding P-channel
• It is typically created automatically by the subscriber software when it receives incoming P-channels pushed by publishers
• The subscriber can also create an S-channel manually:
– the subscriber must know in advance which channel names are available on the publisher side
– if the channel name specified in the S-channel does not exist on the publisher side, the S-channel is invalid, much like tuning a TV to the wrong frequency for a station
S-channel
The GUI provides buttons to create, modify, and delete S-channels, and to update the selected S-channel, i.e. to pull the data for this channel from the specified publisher.
New S-channel
• the address of the publisher node
• channel name
• description
• update type and option (these can also be set for an existing S-channel; this way, the subscriber can control how data are received)
Remote Invocation
[Diagram: components C1, C2, C3; machines N1, N2, N3]
To distribute and invoke components C1, C2 and C3:
1. A publisher (on N) creates three channels, one for each of N1, N2 and N3.
2. The publisher specifies the main class in the descriptor file app.xml (explained on the next slide) for each channel.
3. The publisher adds the addresses of N1, N2 and N3 to the file target.properties.
4. The publisher distributes all channels to the three subscribers.
5. The publisher uses the "remote start" button.
XML-based Descriptor
Defines the startup class for the distributed program and the path information needed to load the classes this program depends on, for example:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <App name="CS Lab course">
        <MainClass>ca.acadiau.cs.Lab</MainClass>
        <ClassPath>client.jar</ClassPath>
    </App>
The descriptor should be located in the channel and is sent to the subscriber along with the channel’s other content.
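A subscriber could read such a descriptor with the JDK's built-in DOM parser. The sketch below assumes the element names are MainClass and ClassPath, as in the example; the class and helper names are hypothetical and do not reflect how DDM itself parses the file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch: extracting the startup class and class path from a descriptor. */
final class DescriptorSketch {
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
          + "<App name=\"CS Lab course\">\n"
          + "  <MainClass> ca.acadiau.cs.Lab </MainClass>\n"
          + "  <ClassPath> client.jar </ClassPath>\n"
          + "</App>\n";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid descriptor", e);
        }
    }

    /** Returns the trimmed text of the first element with this tag, or null. */
    static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    static String mainClass(String xml) {
        return text(parse(xml), "MainClass");
    }

    static String classPath(String xml) {
        return text(parse(xml), "ClassPath");
    }

    public static void main(String[] args) {
        System.out.println(mainClass(SAMPLE) + " on " + classPath(SAMPLE));
    }
}
```

Trimming the element text makes the parser tolerant of whitespace around the class name and the JAR name.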
XML-based Descriptor
A program can be started remotely with a predefined policy file and arguments.
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <App name="CS Lab course">
        <MainClass>cs.Lab</MainClass>
        <ClassPath>client.jar</ClassPath>
        <PolicyFile>policy.txt</PolicyFile>
        <Arguments>Michael: 27</Arguments>
    </App>
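One way the four descriptor fields could be turned into a launch command is sketched below. It assumes the program is started as a separate JVM, that the policy file is passed through the standard java.security.policy system property, and that arguments are separated by whitespace. None of this is confirmed by the slides; DDM may load the class in-process instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: building a "java" command line from descriptor values. */
final class LaunchSketch {
    static List<String> command(String mainClass, String classPath,
                                String policyFile, String arguments) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        if (policyFile != null && !policyFile.trim().isEmpty()) {
            // Standard JVM properties that enable a security policy file.
            cmd.add("-Djava.security.manager");
            cmd.add("-Djava.security.policy=" + policyFile.trim());
        }
        cmd.add("-cp");
        cmd.add(classPath.trim());
        cmd.add(mainClass.trim());
        if (arguments != null && !arguments.trim().isEmpty()) {
            // Assumption: arguments are whitespace-separated.
            for (String arg : arguments.trim().split("\\s+")) {
                cmd.add(arg);
            }
        }
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ",
                command("cs.Lab", "client.jar", "policy.txt", "alpha beta")));
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder to start the program; the sketch stops short of doing so.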
Conclusions
• DDM provides both data distribution and remote invocation
• It can be used with firewalls. Consider the example of a campus that uses a firewall:
– it is possible to set up an S-channel from an off-campus machine to a machine inside the firewall
– the publisher inside the firewall can push data to the off-campus machine
Future Work
• The current interface is GUI-based; for some applications it would be useful to provide a text-based interface
• To improve the flexibility of our system, we will add a publisher broker, which will maintain all available publishers (and their channels) so that subscribers can dynamically set and modify these data
• Support the invocation of non-Java programs