Development ideas: October 2008

Tuesday, October 28, 2008

Remoting through interfaces

When we use Remoting, we usually think in terms of proxy classes that must be deployed with the client application. But is that the best approach?
OOP teaches us to design interaction around interfaces, which means that components should interact only through contracts. Why not apply the same idea to Remoting? It is entirely possible, and it is the better approach.
Let’s review this piece of code:

public class RemotedObject:MarshalByRefObject
{
 public void SomeMethod()
 {
  //… some code here
 }
}

Let’s update the code above and add a contract for interacting with it:

public class RemotedObject:MarshalByRefObject, IRemotedObjectContract
{
 public void SomeMethod()
 {
  //… some code here
 }
}

//The interface is defined in a new module
public interface IRemotedObjectContract
{
 void SomeMethod();
}

The remoted object can now be called through the IRemotedObjectContract interface. The interface definition should live in a separate module that is referenced both by the main (RemotedObject) assembly and by the client. The client then no longer has to generate a proxy class by hand and attach that module; referencing the module that owns the interface is enough. Notice that nothing else has to change on the server side. On the client side, accessing the object looks something like this:

//…
IRemotedObjectContract tmpRemote = (IRemotedObjectContract)Activator.GetObject(typeof(IRemotedObjectContract), tmpUrl);
//Use the retrieved tmpRemote interface to interact with the remote object
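
For completeness, here is a minimal sketch of how a server could host the object so that the client call has something to connect to. It assumes the .NET Framework with a reference to System.Runtime.Remoting.dll. The TCP channel, the port 8085 and the "RemotedObject.rem" URI are illustrative choices, not something the approach requires. All types are declared in one file only to keep the sketch self-contained; in a real solution the interface would sit in its own module.

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Tcp;

// Shared contract (would normally live in its own module)
public interface IRemotedObjectContract
{
    void SomeMethod();
}

// Server-side implementation of the contract
public class RemotedObject : MarshalByRefObject, IRemotedObjectContract
{
    public void SomeMethod()
    {
        Console.WriteLine("SomeMethod was called");
    }
}

public static class RemotingHost
{
    public static void Main()
    {
        // Hypothetical port and URI, chosen for illustration only
        ChannelServices.RegisterChannel(new TcpChannel(8085), false);
        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(RemotedObject), "RemotedObject.rem", WellKnownObjectMode.Singleton);

        Console.WriteLine("Listening on tcp://localhost:8085/RemotedObject.rem");
        Console.ReadLine(); // keep the host alive until Enter is pressed
    }
}
```

With a host like this running, the client would pass "tcp://localhost:8085/RemotedObject.rem" as the URL and would need nothing but the module that contains the interface.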

That is all there is to it, and it is a really good approach in .Net Remoting.

Sunday, October 26, 2008

Implementing "C/C++" [union] in .Net

To improve performance, the CLR is capable of arranging the fields of a type any way it chooses. When you define a type, you can tell the CLR whether it must keep the type's fields in the same order as the developer specified them or whether it can reorder them as it sees fit.
The System.Runtime.InteropServices.StructLayoutAttribute attribute tells the CLR how to lay out the fields. You can pass LayoutKind.Auto to its constructor to have the CLR arrange the fields, LayoutKind.Sequential to have the CLR preserve the order you specified, or LayoutKind.Explicit to arrange the fields in memory yourself by using offsets.
Microsoft's C# compiler selects LayoutKind.Auto for reference types, and LayoutKind.Sequential for value types.
As mentioned above, the StructLayoutAttribute also allows you to indicate the offset of each field explicitly by passing LayoutKind.Explicit to its constructor. You then apply the System.Runtime.InteropServices.FieldOffsetAttribute attribute to each field, passing to its constructor an Int32 that gives the offset, in bytes, of the field's first byte from the beginning of the instance. Giving several fields the same offset makes them overlap in memory, which is how a C/C++ union behaves. Here is an example:


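using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct CUnionAlternate
{
   [FieldOffset(0)]
   public byte byteField;
   // Shares its first byte of storage with byteField
   [FieldOffset(0)]
   public short shortField;
}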
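
To see the overlap in action, here is a small sketch. The UnionDemo type and its public field names are made up for this illustration. It writes through the short field and reads the same memory back through the byte field. Which byte you see depends on the byte order of the machine, so the example prints that as well.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct UnionDemo
{
    [FieldOffset(0)]
    public byte ByteField;
    [FieldOffset(0)]
    public short ShortField;
}

public static class UnionExample
{
    public static void Main()
    {
        UnionDemo u = new UnionDemo();
        u.ShortField = 0x0102;

        // ByteField reads the byte stored at offset 0 of ShortField.
        // On a little-endian machine that is the low byte, 0x02.
        Console.WriteLine("ByteField = 0x{0:X2}", u.ByteField);
        Console.WriteLine("Little-endian: {0}", BitConverter.IsLittleEndian);

        // The struct is as large as its largest field, not the sum of both.
        Console.WriteLine("Size = {0}", Marshal.SizeOf(typeof(UnionDemo)));
    }
}
```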

Friday, October 24, 2008

Optimizing Garbage Collection

Weak References

There is a way to influence the behavior of garbage collection, which .Net exposes through the WeakReference class.
When an object points to another one, this is called a strong reference (or just a reference, as we usually say), and in this case the GC will not collect that object as "garbage". A weak reference does not keep its target alive: the object it points to can be collected, and if the object is later accessed through the WeakReference, the Target property returns null.
The managed heap contains two internal data structures whose sole purpose is to manage weak references: the short and the long weak reference tables.
A short weak reference does not track resurrection: it is cleared as soon as the garbage collector determines that the object is unreachable, even if the object has a Finalize method that has not run yet. A long weak reference tracks resurrection: it is cleared only after the garbage collector determines that the object's storage is reclaimable, which means that, if the object has a Finalize method, the method has been called and the object was not resurrected.
These two tables simply contain pointers to objects allocated within the managed heap. Initially, both tables are empty. When you create a WeakReference, the target object is not allocated a second time on the managed heap. Instead, an empty slot in one of the weak reference tables is located and set to the address of the target; short weak references use the short weak reference table and long weak references use the long weak reference table.
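
A minimal sketch of how this looks from C#. The class and method names are made up for the illustration. The one-argument constructor creates a short weak reference, and passing true as the second argument (trackResurrection) creates a long one. Whether the unreferenced targets are really gone after GC.Collect() depends on the runtime and build settings, so the example prints what it observes instead of assuming a result.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class WeakReferenceExample
{
    // Created in a separate, non-inlined method so that no strong reference
    // to the target is left behind in the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateWeak(bool trackResurrection)
    {
        return new WeakReference(new object(), trackResurrection);
    }

    public static void Main()
    {
        object strong = new object();
        WeakReference weakToHeld = new WeakReference(strong); // short
        WeakReference shortWeak = CreateWeak(false);          // short
        WeakReference longWeak = CreateWeak(true);            // long

        GC.Collect();

        // This target is still strongly referenced, so it stays alive.
        Console.WriteLine("Held target alive: {0}", weakToHeld.IsAlive);
        // These targets have no strong references and may have been collected.
        Console.WriteLine("Short weak alive: {0}", shortWeak.IsAlive);
        Console.WriteLine("Long weak alive: {0}", longWeak.IsAlive);

        // Safe access pattern: copy Target into a strong local, then test it.
        object target = shortWeak.Target;
        if (target == null)
        {
            Console.WriteLine("The target has been collected");
        }

        GC.KeepAlive(strong); // keep the strong reference alive up to here
    }
}
```

Copying Target into a local variable before the null check matters: testing IsAlive first and reading Target afterwards leaves a window in which the object can be collected.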

Generations

Since garbage collection cannot complete without stopping the entire program, it can cause pauses at arbitrary times during the execution of the program. Those pauses can also prevent programs from responding quickly enough to satisfy the requirements of real-time systems.
One of the improvements to the GC is called generations. A generational garbage collector relies on two observations:

Newly created objects tend to have short lives.
The older an object is, the longer it will survive.

Such collectors group objects by “age” and collect younger objects more often than older ones. All new objects are placed in generation “0”, and a garbage collection is triggered when generation “0” fills up. As most objects are short-lived, only a small percentage of “young” objects are likely to survive their first collection. An object that survives its first garbage collection is compacted and promoted to generation “1”. Objects created after that collection start in generation “0” again. The garbage collector is invoked next when the sub-heap of generation “0” fills up once more. If that collection also examines generation “1”, the objects in generation “1” that survive are compacted and promoted to generation “2”, and the survivors in generation “0” are compacted and promoted to generation “1”. Generation “0” then contains no objects, and, as already mentioned, all objects created after the collection go into generation “0”.
Generation “2” is the maximum generation supported by the runtime's garbage collector. When future collections occur, any surviving objects currently in generation “2” simply stay in generation “2”.
Thus, dividing the heap into generations and collecting and compacting mostly the younger generations improves the efficiency of the underlying garbage collection algorithm: it reclaims a significant amount of space from the heap and is faster than examining the objects in all generations.
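
The promotion of a surviving object can be observed with GC.GetGeneration. The following is a minimal sketch with made-up names. It forces full collections with GC.Collect(), which real code rarely needs to do. On a typical desktop runtime the object is reported in generation 0, then 1, then 2, where it stays, but the exact numbers are up to the runtime.

```csharp
using System;

public static class GenerationsExample
{
    public static void Main()
    {
        object survivor = new object();
        Console.WriteLine("Max generation: {0}", GC.MaxGeneration);
        Console.WriteLine("After allocation: gen {0}", GC.GetGeneration(survivor));

        GC.Collect(); // full collection; survivor is still referenced
        Console.WriteLine("After 1st collection: gen {0}", GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("After 2nd collection: gen {0}", GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("After 3rd collection: gen {0}", GC.GetGeneration(survivor));

        GC.KeepAlive(survivor); // keep the object reachable up to this point
    }
}
```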
That is all about garbage collection that, I think, every .Net developer must know.

Thursday, October 23, 2008

Understanding the Garbage Collection

One of the most important parts of the .Net framework is the Garbage Collector. Instead of leaving you to take care of all the memory you have used, it automatically collects all unnecessary "garbage" when the time comes. But how? This is the question I am going to discuss in this article.

The .NET CLR (Common Language Runtime) requires that all resources be allocated from the managed heap. The developer never needs to free objects from the managed heap; they are freed automatically when they are no longer needed by the application. When the garbage collector runs, it checks for objects in the managed heap that are no longer needed by the application and performs the necessary operations to reclaim their memory. Because every .Net type is described by its metadata, the garbage collector always knows how to free the unneeded memory.
The garbage collector starts its job by locating the roots of the application.

The roots are:

all the global and static pointers
local variable pointers (which are on a thread's stack)
registers, which contain pointers to objects in the managed heap
pointers to the objects from the Freachable queue

The list of active roots is maintained by the JIT compiler and the CLR, and is made accessible to the GC algorithm.
The GC works in two phases. Let's review them in detail.

The first phase (Marking)

(When the GC starts, it assumes that all the objects in the heap are garbage.)

identification of roots
building the live object graph (GC runs through the roots and identifies live objects using the metadata of the objects)

To avoid cycling endlessly, if the GC tries to add an object that is already in the graph, it stops following the path along which the object was found and makes no further searches down that path.
When all the roots have been checked, the graph contains all the live objects, so any object that is not in the graph is garbage.
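
The marking idea can be sketched on a toy object graph. This is only an illustration of the algorithm described above, not how the CLR is implemented: the Node class, the Mark method and the explicit stack are all invented for the example.

```csharp
using System;
using System.Collections.Generic;

// A toy "object" that can reference other objects
public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();

    public Node(string name)
    {
        Name = name;
    }
}

public static class MarkPhaseSketch
{
    // Returns the set of objects reachable from the given roots.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        HashSet<Node> live = new HashSet<Node>();
        Stack<Node> pending = new Stack<Node>(roots);

        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            // Already in the graph: do not follow this path any further.
            if (!live.Add(current)) continue;
            foreach (Node child in current.References)
            {
                pending.Push(child);
            }
        }
        return live;
    }

    public static void Main()
    {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.References.Add(b);
        b.References.Add(a); // a cycle between a and b
        c.References.Add(d); // c and d are not reachable from any root

        HashSet<Node> live = Mark(new[] { a });
        Console.WriteLine(live.Count);       // prints 2 (a and b)
        Console.WriteLine(live.Contains(c)); // prints False, so c is garbage
    }
}
```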

And now comes the second phase (Compacting).

The GC walks through the heap linearly, looking for garbage blocks of memory.
The GC shifts non-garbage objects down in memory (updating all the references to the moved objects, of course), which removes all the gaps in the heap.

After this phase, the allocation pointer is set to the end of the last object in the heap, marking the place where the next new object can be allocated.
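
The compacting step can be sketched on a toy heap as well. Everything here is an assumption made for the illustration: the heap is a plain array of strings, liveness comes from a ready-made bool array, and the "moved" dictionary stands in for the bookkeeping a real collector would use to update references.

```csharp
using System;
using System.Collections.Generic;

public static class CompactPhaseSketch
{
    // heap[i] holds a toy "object" or null for a gap; live[i] says whether
    // the mark phase found that object reachable. Live objects are slid down
    // to close the gaps. Returns the index where the next allocation goes.
    public static int Compact(string[] heap, bool[] live, Dictionary<int, int> moved)
    {
        int next = 0;
        for (int i = 0; i < heap.Length; i++)
        {
            if (heap[i] == null || !live[i]) continue; // gap or garbage
            if (i != next)
            {
                heap[next] = heap[i];
                moved[i] = next; // old position -> new position
            }
            next++;
        }
        // Everything past the last live object is free space now.
        for (int i = next; i < heap.Length; i++)
        {
            heap[i] = null;
        }
        return next;
    }

    public static void Main()
    {
        string[] heap = { "A", "B", "C", "D", "E" };
        bool[] live = { true, false, true, false, true };
        Dictionary<int, int> moved = new Dictionary<int, int>();

        int next = Compact(heap, live, moved);
        Console.WriteLine(string.Join(",", heap, 0, next)); // prints A,C,E
        Console.WriteLine(next);                            // prints 3
    }
}
```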

Finalization

Whenever a new object that has a Finalize method is allocated on the heap, a pointer to the object is placed in an internal data structure called the Finalization queue. When an object is not reachable, the garbage collector considers it garbage. The garbage collector then scans the Finalization queue looking for pointers to these objects. When a pointer is found, it is removed from the Finalization queue and appended to another internal data structure called the Freachable queue, which makes the object reachable again, so it is no longer treated as garbage. At this point, the garbage collector has finished identifying garbage. The garbage collector compacts the reclaimable memory, and a special runtime thread empties the Freachable queue, executing each object's Finalize method.
The next time the garbage collector is invoked, it sees that the finalized objects are truly garbage, and the memory for those objects is then simply freed.
It is recommended to avoid Finalize methods unless they are required. Finalize methods increase memory pressure, because the memory and the resources used by the object are not released until the second garbage collection. And since you have no control over the order in which Finalize methods are executed, relying on that order may lead to unpredictable results.
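
The two-collection lifetime of a finalizable object can be observed with a small experiment. This is a sketch with made-up names, and its output depends on the runtime and build settings, so treat the printed values as an observation and not a guarantee. It uses a long weak reference (trackResurrection set to true), which stays valid until the object's storage is actually reclaimed.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Finalizable
{
    public static int FinalizedCount;

    // C# destructor syntax compiles to an override of Object.Finalize
    ~Finalizable()
    {
        FinalizedCount++;
    }
}

public static class FinalizationExample
{
    // Non-inlined so that no strong reference survives in the caller
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateGarbage()
    {
        return new WeakReference(new Finalizable(), true);
    }

    public static void Main()
    {
        WeakReference tracker = CreateGarbage();

        GC.Collect();                  // unreachable object goes to the Freachable queue
        GC.WaitForPendingFinalizers(); // the finalizer thread runs ~Finalizable
        Console.WriteLine("Finalized: {0}", Finalizable.FinalizedCount);
        Console.WriteLine("Still in memory: {0}", tracker.IsAlive);

        GC.Collect();                  // a second collection can reclaim the memory
        Console.WriteLine("Still in memory: {0}", tracker.IsAlive);
    }
}
```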
In my next article I will try to explain how to optimize the GC.

Monday, October 6, 2008

Displaying IFrame contents in DIV containers

There are many situations when we need to show web content from another URL on our website. The standard solution for these situations is the IFrame tag, but using an IFrame is not a good approach, because it is resource-consuming. The better approach, of course, is to use a DIV container and show the needed content in it.
That is not hard in 90% of situations, but there are special cases when we really need help.

The problem: there is a Flash movie on the target web page that takes flashvars as input and uses them. When we use the simple ([DIV].innerHTML = [Iframe].innerHTML) mechanism, we find, for example, that the links do not work, or we run into other issues with those movies.

The thing is that when we read the content of the OBJECT tag through the innerHTML property, we get a modified version and not the original content. The problem is that the modified version drops the content of the "flashvars" attribute.

The solution is to use a kind of trick. Instead of using the innerHTML property to get the inner PARAM elements, which represent the parameters, we get them directly by calling the document's getElementsByTagName() method with "PARAM" as the argument. This returns the original list of params, which can then be applied to a new object element created with document.createElement("object").

But there is one more situation left. A huge number of these movies use events that are handled by the web page, and the handler scripts are defined in the header part of the page.
To make sure that those cases are also covered, we should read all the script tags from the header of the page and add them to the header of our document, dynamically creating new script tags on our document object so that the browser actually runs the code instead of ignoring it.

So, I hope you will never face this problem now.