Tuesday, October 20, 2009

Armstrong Thesis Ch2: The Architectural Model

This chapter presents architecture for building fault-tolerant systems.

The keys to this software fault-tolerance are(from the paper directly):

• Software modularity through processes, and messages.
• Fault containment through fail-fast software modules.
• Process-pairs to tolerant hardware and transient software faults.
• Transaction mechanisms to provide data, and message integrity.
• Transaction mechanisms combined with process-pairs to ease exception handling, and tolerate software faults.

The main goals of programming concurrently(COPL – explained down) is reliability (through the process abstraction), scalability, security and even modularity.

Processes are stated as central items in the system and author states that the only way to have true reliability is to have each module in its own process so that errors can be contained and the process killed if an error is detected. The idea is that when the error process is killed environment can be recovered quickly as all other processes are error free and the only one which had issues have been killed. This makes the recovery easier.

Author introduced a new concept of a Concurrency Oriented Programming Language (COPL) in which programs should model the concurrent items and programmers can map the concurrency world one-to-one with the concurrency of their programs. To achieve this it is made a first class citizen of the language and the runtime environment utilizes green threads to make concurrency cheap, which gives the programmer the freedom to model all the concurrency of the problem domain. I have to read one more time to get full idea but hoping to listen to class first on this item.

Processes communicate through asynchronous message passing. So no state can be shared and process share data through message passing. This communication style supposed to enforce more reliability, since messages are asynchronous should make the system more dead lock free and programs that copy data have greater data locality and avoid false sharing, which are important for performance.

Event based and Map reduce patterns

Event based

This pattern talks about how the composed of components can post events or listen to events and a medium which is act as the bridge between the listeners and announcers. The medium dispatches the received events to affected listeners and allows listener to register and listen to events of their choice. Nicely written pattern and I used this in CS427 when I developed an android application.

Map reduce

I have introduced to this one when I did functional programming last semester using Ocaml. The idea is applied with parallelism here. The map function is applied in parallel to each object in a set, then the results are collected and combined by the reduce function to get the output. This is implicitly implemented so user’s just needs to supply the map and reduction and everything else will be taken care by the framework. This is very useful pattern as we have larger data sets and like to apply the reductions on it.

Thursday, October 15, 2009

On OPL Pattrens - Pipes & Filter, Layered Systems, Iterative Refinement

We already studied about Pipes&Filters and Layered patterns but these OPL papers present them from a parallel perspective.

The new Pipes & Filters and Layered Systems patterns are written in very simple form and give us the quick idea on the pattern.

Since these are familiar patterns to continue our conversation on these topics, I present my thoughts on Prof. Johnson’s questions –

How do the two repeats differ from the first versions that you read?
These new versions of Layered Systems and Pipes and Filers Systems presented for parallel system. For Pipes and Filers I like the solution presented as directed graph gave a good idea on the solution through this analogy.

They are shorter. Did they miss anything?
Since I am familiar with these patterns they look sufficient but after reading I thought it may be difficult for those who are learning these patterns for the first time to understand fully from these papers.

Did they include something that the first versions didn't?
First version has more details of how we can implement these patterns than this version but I like the idea of presenting the solution for pipes-n-filters pattern for parallelism through a directed graph.

Did you learn anything from them?

Not really. But like the way they presented simple and straight forward.

Do you have any advice to the authors of these patterns?
I would like to suggest presenting more on how these patterns (since they are known and common patterns for the sequential app designers) could be used or modified to use parallelism advantages more explicitly.

For the new pattern, have you seen programs that used it?
I didn’t use it before but it reminded me of general programming loops and Danny Dig’s paper ReLooper and felt familiar to me.

Chess - Finding and Reproducing Heisenbugs in Concurrent Programs

CHESS is an automated concurrency testing tool developed my Microsoft for finding and reproducing Heisenbugs in the concurrent programs. “Heisenbugs” are bugs resulted from unexpected interference among threads, which are extremely difficult to reproduce and eliminate. This type of bugs may show up in systems that may otherwise have been running stable for months and their cause may be as simple as adding some extra debug statements.

When Chess is attached to a concurrent program, it takes complete control over the scheduling of threads and asynchronous events and captures all the interleaving non determinism in the program. This provides a way to find all the execution paths results in an error and can reproduce. The challenge here is not only to integrate CHESS with complex APIs but also to make sure that it does not introduce extra behavior that is not related to the concurrent program.

The most important component of CHESS, the scheduler and the primary goal of the CHESS scheduler is to capture all the nondeterminism during a program execution. This allows the scheduler to later reproduce a chosen concurrent execution by replaying these nondeterministic choices. It is implemented by redirecting calls to concurrency primitives to alternate implementations from a wrapper library. Including an implementation of these primitives as part of the program is a design choice that may simplify the schedule which now needs to understand only simpler system primitives that the lock implementation uses. However the tradeoff with this approach is that it does not allow the scheduler to expose all the nondeterminism in the lock implementation. Another design decision meant to simplify the implementation of the scheduler is to abstract the execution of the concurrent program using Lamport’s happens-before graph, which is a graph that captures the relative execution order of threads in a concurrent program. This approach offers a common framework for reasoning about all synchronization primitives used in the program which abstracts away the timing of instructions in the execution.

One more important piece of chess is dealing with data races uses single threaded execution. This may slow down the execution a bit but it can be compensated by running multiple instances in parallel, each exploring a different part of the interleaving state space.

I think this is very good tool to capture bugs that are uncaught by stress testing and it is also tracking where the bugs are accruing in the execution and this make the tool very valuable. It will be nice to try out the tool in future.

Tuesday, October 13, 2009

Refactoring for Reentrancy

This paper is the third one on automatically refactoring programs for parallelism. In this case, the tool enabled programmers to change programs to make them reentrant, by removing global references, using lazy initializers and warning about unsafe API calls. After reading this paper, I think the tool could be useful in some situations where a program is re-factored with the goal of running several copies simultaneously but overall this tool is not very impressive and may need lot more work to be able to recognize by the users as a good tool.

In summary, the contributions of this paper are as follows (directly from the paper):

1. Characterization of the causes of non-reentrancy in Java programs,

2. Mostly-automated refactoring for making Java programs reentrant by replacing global state with thread-local state, with associated precondition checking that warns of possible behavioral changes due to fragile static initializer behavior and non-reentrant library usage,

3. An implementation of a refactoring tool, Re-entrancer, as an extension to the Eclipse JDT, and

4. An evaluation of Re-entrancer on a set of Java applications, demonstrating its practicability.


The refactoring is comprised of the following 5 steps:

1. Removal of non-reentrant accesses to library global state.

2. Encapsulation of static fields in the application in getter/ setter methods.

3. Replacing static initializers with explicit lazy initialization.

4. Replacing global state with thread-local state.

The refactoring tool does all these automatically, and gives the developer warnings before doing so. Given benchmarks for analyzing the results were far from impressive - some programs ran even slower despite using multiple threads due to the overhead created by the getters and setters. The program could also take substantial time to run. I would also have liked more reasoning on why some methods were rejected at the beginning of the paper, such as making an environment block. It requires a lot of code change, but this approach did to, it would be interesting to see them contrasted.

Thursday, October 8, 2009

BA CH13: Object Oriented Vs Functional Programming

This chapter authors compare and contrast of functional programming versus object-oriented programming. OO is discovered later to functional language sets and developed nicely. So, I am not sure how we can successfully compare these 2. I only did functional programming in few of school projects (when it’s really necessary :) and not more than that but I am OO fan and I will continue to be in the coming years too.

As author mentioned in the chapter, In Object oriented programming everything is an object and will have a state and behavior. This style provides modularity and information hiding, reusability, dynamic binding, inheritance among many others.

In functional programming, a set of functions each of which performing a task. Executing these set of function results in the solution to the problem. Like in OO, functional programming is also achieving modularity, by the use of stateless functions, high level functions, and other recursively defined types, as well as lazy evaluation, so this is only shows fine-grain modularization. I think this makes it harder to use it to build larger applications

Q1: What types of problems lend themselves to a Functional Solution?

Mathematical problems that do not require remembrance of state but simply produce results after the calculations.

Q2: What types of problems lend themselves to an Object-Oriented Solution?

Real life problems, Objects that will have state and can represent items occurring in real life and can communicate with other objects and accomplish solutions.

Q3: Is the Functional universe more flexible from a software Architecture point of view? If so, how; and of not, why?

I think, we can always just take any function and pass it to other function to wrap additional functionality around it. But I think it is difficult to manage the code like this though..

Q4: Is Object-Oriented environment more manageable as a system grows towards the enterprise level? If so, how; and of not, why?

Yes, OO system is manageable as systems grow larger. As one of the main concepts of OO it is reusable and maintainable. Additional functionality can easily be added to the existing system by simply adding new classes, inheriting, changing implementations of parent methods or over writing inherited methods, etc...

ReLooper

This paper describes an interactive refactoring tool ReLooper for parallelizing loops in Java. It does this by using the Java ParallelArray(an array data structure that supports parallel operations over the array elements) framework, which allows operations to be carried out concurrently by threads on an array. I like this tool as it is interactive and give users a chance to make sure parallelism is done correctly. I think this tool can be very useful to parallelize a loop where each iteration performs its own independent operation. To continue conversation, here are my thoughts on Danny’s questions …

Q1: Sometimes it is possible to retrofit parallelism by refactoring an existing sequential application. Other times the whole system needs to be re-architected for parallelism. What are the advantages / disadvantages of these approaches? When would you do one over the other?

Refactoring a sequential application to parallel will give better performance over all and save cost of making the application to parallel but may leave code and architecture unclean. I may prefer this approach if I have limited resources (people and money to do it) and my application has cleaner architecture in hand and it’s not very big. In this situation, I may be able to covert the application to parallel but still be able to clean the code and architecture as needed.

I may not be able to do above in the case where I don’t have architecture of the application in hand and I don‘t know how the most of the functionalities in the application are implemented (I may be using this as black box in my newer development parts of the application). In this case, conversion may cause more confusion for maintenance and even result in different behavior of the app than required. So, we should go for either full re-architecture for parallelsim.

Q2: There are many libraries that target concurrency and parallelism (e.g., Java's ParallelArray, Microsoft's TPL, Intel's TBB, OpenMP, MPI, etc.). What are the advantages and disadvantages of using parallel libraries?

Libraries will help programmers to implement parallelism quickly and easily without worrying about some of the underlying implementation details. However, programmers may miss opportunities for parallisim overlooking only the available library available items.

Q3: The approach presented in the paper puts the programmer in the driving seat: the programmer selects a snapshot of code, and a refactoring. The process is semi-automated, but not fully automatic. Compare and contrast the refactoring approach with a fully automatic approach, like the one used in automatic parallelizing compilers.

I think each have it’s own benefits and work well for their situations. A fully automated system requires less work and may be better to use it in places where we know we don’t need human interaction to convert them from sequential to parallel, but we will need a semi-automated tool like this one where we know that we need some control how the parallasim is applied to maintain the application integrity.

Q4: Some of the transformations presented in the paper make the code harder to understand (e.g., replacing a loop with a parallel operation). What are some approaches to unclutter the parallel code?

May be some commenting showing or relating the converted items to original code (sequential code) and also maintain neat structure of the code after conversion may help to read it better.

Q5: Automatic loop parallelization has been a long time topic in the Fortran community, with compilers having various degrees of success. Is this problem still relevant for Object-Oriented code? What is different now?

I think the problem is still relevant but more complex. In OO programming we have more complicated concepts like shared objects can be more difficult to detect and account for when parallelizing code.

Q6: The paper presents some empirical evaluation to support the claim that the tool is useful. What are some other factors that you would have liked to see evaluated?

I would like to see actual data size that was run on these cores and also it will be nice to see more details along with table 2 like how much human interaction needed to resolve the conflicts etc.

Q7: The paper presents several program analyses to ensure that loop iterations can run safely in parallel. Are you convinced that these analyses are enough, in other words, can you think about examples that would not be covered by the presented analyses?

Since I haven’t tried the actual tool, it’s hard to come up with a scenario where this tool is not covered but overall paper convenes me that it is safe as it clearly described that it does static analysis on the code before conversion.

Monday, October 5, 2009

BA CH 12 : When the Bazaar Sets Out to Build Cathedrals

For Tuesday's class we will be studying BA chpater 12 and i will be leading the discussion. I am hoping to explore the following questions:


In page 283, Authors talk about some key decisions that KDE made between QT V3 to QT V4 release and favor to redesign majority of the KDE and this was risky and needed to support KDE QT 3 in parallel until completing the KDE QT 4. I think redesign is a wise decision and time proved indeed it is, “made KDE very stable and platform independent structure and this drove more users to use KDE and so on…” What can we learn from decision making? Do you have any similar experience? Do Cathedral type projects need this type of decision making in the middle of construction of the projects?


We see important projects like “PIM (personal information maintenance -Bazaar structure) is developed by open source community. Author mentioned some key technical decisions like dropping the idea if it doesn’t work in reasonable time frame; maintain the core functionality working all the time while the changes have been made to the project, etc… What are some other ideas that you get in mind while you are reading these technical decisions or some other ideas that we may have seen in cathedral type project building (in our work places)?


After reading the Evolution of Akonadi, KDEPIM community depended on architecture meetings / conference secessions to discuss the key architecture items and author mentioned about most of the high level / key items are agreed but implementation of individual items still faced some heated discussions. Have you had any similar experience where you agreed to key concept but varied how it needs to be implemented either in bazaar / cathedral model implementation of the project?


After reading the chapter I am, one more time convinced that complete problem analyzation is needed to get the powerful and complete solutions. (If we know problem excited with current piece of design / code we will fix it at some point?), Do you agree?


What are strengths of the Bazaar Structure that out build the cathedral structures that we have known?