Thursday, 10 January 2013

Sirius: core Win32 functionality support

With the infrastructure in place, it's time to work on the functionality itself. I'll start with the part that interacts with Win32 objects. In this article I'll describe the basic preparations for Win32 interaction as well as the core functionality needed to find a window on screen and perform some manipulations with it. Of course, this is done in the scope of Sirius development, so it is bound to the overall architecture of the platform.

Scope of work

For Win32 interaction I'm going to use the JNA library. The server side should contain wrappers around JNA functions. The clients will use those wrappers via the SOAP interface. All the clients will be encapsulated in one major client, which should be the main entry point of the client library. Schematically, the entire structure can be represented with the following diagram:

As the diagram shows, I'm going to use the User32.dll and Kernel32.dll libraries. From Java code they are accessible via JNA. The further development steps are:
  1. Create classes which wrap User32.dll and Kernel32.dll calls
  2. Modify the wrappers to make them work as web-service endpoints
  3. Create core functionality which implements utility functions like search for windows
  4. Generate the client code
So, next we'll do the above steps and a bit more.

Wrapping native libraries

Updating dependencies

First of all, we should add the JNA library to the Server project. For this purpose we should update the project's pom.xml file with the following content:

<dependency>
 <groupId>net.java.dev.jna</groupId>
 <artifactId>platform</artifactId>
 <version>3.5.1</version>
</dependency>
It should be put into the dependencies section.

Actually, we're mostly interested not in JNA itself but rather in its platform library, which uses JNA as a dependency. Once that's done we can use the library and write something like:

import com.sun.jna.platform.win32.Kernel32;

....
Kernel32 kernel32 = Kernel32.INSTANCE;
int pid = kernel32.GetCurrentProcessId();
Good! Let's move on.

Trick with fast code generation

To make a proper WSDL interface I need classes which have the same interface as Kernel32 and User32 but only delegate execution to them. In other words, right now I can use instructions like:

import com.sun.jna.platform.win32.Kernel32;

....
Kernel32 kernel32 = Kernel32.INSTANCE;
int pid = kernel32.GetCurrentProcessId();
where I use Kernel32 directly. But I need to have a class like:
class Kernel32Lib {
 private Kernel32 kernel32;
 
 public Kernel32Lib(){
  kernel32 = Kernel32.INSTANCE;
 }
 
 public int GetCurrentProcessId(){
  return kernel32.GetCurrentProcessId();
 }
}
As you can see, I'm just wrapping the calls to the underlying interface. Such a wrapper class can be used as a web service endpoint.

OK. No problem. I can make numerous copy/pastes and go through the entire class adding the necessary fields and resolving all the types. Almost no problem... except one: it takes too long and it's boring!

So, how do we generate a Java wrapper around some interface that delegates all execution to the wrapped instance? For this purpose I'll use Eclipse and its code generators. All the magic is done in several steps.

Firstly, create a new class which implements com.sun.jna.platform.win32.Kernel32 and, before saving it, set the Inherited abstract methods checkbox:

Thus, we have stubs for the Kernel32 interface.
NOTE
The created class is temporary and will be removed later, so there's no need to invent any specific name for it.
Eventually, we'll have a class like:
public class Kernel32Temp implements Kernel32 {

 /**
  * 
  */
 public Kernel32Temp() {
  // TODO Auto-generated constructor stub
 }

 /* (non-Javadoc)
  * @see com.sun.jna.platform.win32.Kernel32#FormatMessage(int, com.sun.jna.Pointer, int, int, java.nio.Buffer, int, com.sun.jna.Pointer)
  */
 @Override
 public int FormatMessage(int dwFlags, Pointer lpSource, int dwMessageId,
   int dwLanguageId, Buffer lpBuffer, int nSize, Pointer va_list) {
  // TODO Auto-generated method stub
  return 0;
 }
 ...........
}
with a lot of empty methods.

After that we create our target class Kernel32Lib, which should extend the previously created Kernel32Temp class. So, initially we have a class like:

public class Kernel32Lib extends Kernel32Temp {
 public Kernel32Lib() {
  // TODO Auto-generated constructor stub
 }
}
After that we should navigate to Source > Override/Implement Methods and check all the methods of the parent class. We should see a screen like:
After clicking OK our class will be filled with numerous methods like:
 @Override
 public int FormatMessage(int dwFlags, Pointer lpSource, int dwMessageId,
   int dwLanguageId, Buffer lpBuffer, int nSize, Pointer va_list) {
  // TODO Auto-generated method stub
  return super.FormatMessage(dwFlags, lpSource, dwMessageId, dwLanguageId,
    lpBuffer, nSize, va_list);
 }
The final steps here are:
  1. Add a Kernel32 field to the class and initialize it in the constructor. You'll get code like:
     protected Kernel32 kernel32;
     
     public Kernel32Lib() {
      kernel32 = Kernel32.INSTANCE;
     }
    
  2. Replace all occurrences of the word super with kernel32 (use the editor's find/replace functionality invoked with Ctrl + F).
  3. Remove the generated comments (replace the most frequently used ones with an empty string) just to clean up the code
  4. Add the @WebService annotation to the class
  5. Remove the extends Kernel32Temp clause together with the @Override annotations and delete the temporary Kernel32Temp class
That's it. We have a fully operational wrapper around the Kernel32 interface which can be used as a service endpoint. Repeat the same steps for the User32 JNA interface to create the User32Lib class, which is also used as a service endpoint.
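
If Eclipse is not at hand, roughly the same wrapper can be produced with a few lines of reflection. The sketch below is only an illustration and not part of Sirius: the DelegateGenerator class and the SampleLib interface are hypothetical names, with SampleLib standing in for a JNA interface such as Kernel32. It builds the source text of a class whose methods forward to a wrapped field.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Parameter;
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.Collectors;

class DelegateGenerator {

    /** Hypothetical stand-in for a JNA interface such as Kernel32. */
    interface SampleLib {
        int GetCurrentProcessId();
        void Sleep(int millis);
    }

    /** Builds the source of a class that forwards every instance method to a wrapped field. */
    static String generate(Class<?> iface, String wrapperName, String field) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(wrapperName).append(" {\n");
        // Field initialization (e.g. with Kernel32.INSTANCE) is left to the reader.
        src.append(" private ").append(iface.getCanonicalName())
           .append(' ').append(field).append(";\n");

        Method[] methods = iface.getMethods();
        // Reflection returns methods in no particular order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));

        for (Method m : methods) {
            if (Modifier.isStatic(m.getModifiers())) {
                continue; // only instance methods are forwarded to the wrapped object
            }
            Parameter[] params = m.getParameters();
            String declared = Arrays.stream(params)
                    .map(p -> p.getType().getCanonicalName() + " " + p.getName())
                    .collect(Collectors.joining(", "));
            String passed = Arrays.stream(params)
                    .map(Parameter::getName)
                    .collect(Collectors.joining(", "));
            String returnType = m.getReturnType().getCanonicalName();
            String prefix = m.getReturnType() == void.class ? "" : "return ";

            src.append("\n public ").append(returnType).append(' ').append(m.getName())
               .append('(').append(declared).append(") {\n")
               .append("  ").append(prefix).append(field).append('.').append(m.getName())
               .append('(').append(passed).append(");\n")
               .append(" }\n");
        }
        return src.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(SampleLib.class, "SampleLibWrapper", "lib"));
    }
}
```

Generic type arguments are erased in the generated signatures, and parameter names fall back to arg0, arg1 and so on unless the interface was compiled with the -parameters flag, so the output may still need a manual pass.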

Adding new server endpoints

Once I have classes which can be used as endpoints for the Server, I can add them. For this purpose I'll add the appropriate records to the modules.csv configuration file (a detailed explanation of the configuration file can be found in my previous posts). The updates are:

Endpoint                                    ,Class                                      ,Package
...
http://${HOST}:${PORT}/win32/core/kernel32  ,org.sirius.server.win32.core.Kernel32Lib   ,Local
http://${HOST}:${PORT}/win32/core/user32    ,org.sirius.server.win32.core.User32Lib     ,Local
Now our server has Win32-related endpoints.
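
To illustrate how such a record could be turned into a concrete endpoint address, here is a minimal sketch. It is an assumption about the format rather than the actual Sirius loader: the ModuleEntry class and its parse method are hypothetical, the port number is arbitrary, and only the three columns and the ${HOST}/${PORT} placeholders are taken from the file above.

```java
class ModuleEntry {
    final String endpoint;   // first column with placeholders resolved
    final String className;  // second column: endpoint implementation class
    final String pkg;        // third column, e.g. "Local"

    ModuleEntry(String endpoint, String className, String pkg) {
        this.endpoint = endpoint;
        this.className = className;
        this.pkg = pkg;
    }

    /** Splits one record into its three columns and substitutes host and port. */
    static ModuleEntry parse(String line, String host, int port) {
        String[] columns = line.split(",");
        if (columns.length != 3) {
            throw new IllegalArgumentException("Expected 3 columns: " + line);
        }
        String endpoint = columns[0].trim()
                .replace("${HOST}", host)
                .replace("${PORT}", Integer.toString(port));
        return new ModuleEntry(endpoint, columns[1].trim(), columns[2].trim());
    }

    public static void main(String[] args) {
        ModuleEntry entry = parse(
                "http://${HOST}:${PORT}/win32/core/kernel32  ,org.sirius.server.win32.core.Kernel32Lib   ,Local",
                "localhost", 21212);
        System.out.println(entry.endpoint + " -> " + entry.className);
    }
}
```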

Cleaning up the classes

Everything would be fine if there weren't a lot of errors during endpoint initialization. Here is an example of the error message received:

javax.xml.ws.WebServiceException: Unable to create JAXBContext
 at com.sun.xml.internal.ws.model.AbstractSEIModelImpl.createJAXBContext(Unknown Source)
 at com.sun.xml.internal.ws.model.AbstractSEIModelImpl.postProcess(Unknown Source)
 at com.sun.xml.internal.ws.model.RuntimeModeler.buildRuntimeModel(Unknown Source)
 at com.sun.xml.internal.ws.server.EndpointFactory.createSEIModel(Unknown Source)
 at com.sun.xml.internal.ws.server.EndpointFactory.createEndpoint(Unknown Source)
 at com.sun.xml.internal.ws.api.server.WSEndpoint.create(Unknown Source)
 at com.sun.xml.internal.ws.transport.http.server.EndpointImpl.createEndpoint(Unknown Source)
 at com.sun.xml.internal.ws.transport.http.server.EndpointImpl.publish(Unknown Source)
 at com.sun.xml.internal.ws.spi.ProviderImpl.createAndPublishEndpoint(Unknown Source)
 at javax.xml.ws.Endpoint.publish(Unknown Source)
 at org.sirius.server.Starter.startEndPoints(Starter.java:85)
 at org.sirius.server.Starter.main(Starter.java:150)
Caused by: java.security.PrivilegedActionException: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: 1 counts of IllegalAnnotationExceptions
There are 2 major causes of this problem:
  • The parameter type is an interface rather than a concrete class
  • Several classes have the same name. Even though they're located in different packages or outer classes, the WSDL namespace is the same for them by default
Well, the second problem can be resolved by adding the @WebParam annotation with a specified targetNamespace value. But I made a more radical decision. Actually, the methods containing such types usually store references to internal objects, or they take some sort of interface which can hardly be set up from the client side (e.g. I can hardly imagine how to create a method and pass a reference to it as a parameter within a SOAP request). So why would I need clients for them? I decided to remove such problematic methods. The real cases where I need them are rather about internal operations, where explicit use of Kernel32 and User32 is acceptable.
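
Finding the candidates for removal by hand is tedious, so a small reflection check can help with the first cause. This is a minimal sketch and not part of Sirius: EndpointSignatureCheck, SampleCallback and SampleEndpoint are hypothetical names used only for illustration. It is also just a heuristic, since some interface types, such as java.util.List, are handled by JAXB, so every hit still needs a manual look.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EndpointSignatureCheck {

    /** Hypothetical callback type, similar in spirit to JNA callback interfaces. */
    interface SampleCallback {
        boolean callback(int handle);
    }

    /** Hypothetical endpoint class with one plain and one problematic method. */
    static class SampleEndpoint {
        public int GetCurrentProcessId() { return 0; }
        public boolean EnumWindows(SampleCallback proc) { return proc != null; }
    }

    /** Lists public methods whose return type or parameters are interfaces. */
    static List<String> methodsWithInterfaceTypes(Class<?> endpoint) {
        List<String> suspicious = new ArrayList<>();
        for (Method m : endpoint.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers()) || m.isSynthetic()) {
                continue; // only public, hand-written methods end up in the WSDL
            }
            boolean hasInterface = m.getReturnType().isInterface();
            for (Class<?> parameterType : m.getParameterTypes()) {
                hasInterface |= parameterType.isInterface();
            }
            if (hasInterface) {
                suspicious.add(m.getName());
            }
        }
        Collections.sort(suspicious);
        return suspicious;
    }

    public static void main(String[] args) {
        System.out.println(methodsWithInterfaceTypes(SampleEndpoint.class));
    }
}
```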

After removing these unnecessary methods we have working endpoints providing a SOAP interface to Win32 core operations.

Writing utilities code

The concept of window location

We have the core functionality. Thus we can build a code level operating with Win32 objects on top of the core level. One of the key features is window location: before we do anything with a window we should be able to find it. In Win32 terms, we should identify the HWND of the needed window. After that we can send messages to it. But how are we going to locate the window, and how should it be uniquely identified?

The main unique identifier of a window is its HWND, but it has to be searched for. The other attributes I can look for are:

  1. Window class - every window in Win32 has a window class
  2. Window text - probably the most visible part of the window, which can be found out without any special tools
  3. Window index - the ordinal number of the window among those having the same class and caption. Many windows can have the same class as well as the same text, so the index identifies exactly which one we're looking for.
  4. Parent window - this is useful to narrow the search scope to some parent window. So, that would be another attribute
  5. Others - some specific windows can have additional attributes which uniquely identify them. E.g. dialog box controls usually have a resource ID which rarely changes and is more or less unique
So, when we look for some window we operate with a combination of the above attributes, and that should be reflected in the code. For this purpose I've created a class for storing window location attributes. It has the following code:
package org.sirius.server.win32;

import com.sun.jna.platform.win32.WinDef.HWND;

public class Win32Locator {

 private HWND hwnd;
 private HWND parent;
 private String winClass;
 private String caption;
 private int index;
 
 public Win32Locator() {
  hwnd = null;
  parent = null;
  winClass = "(.*)";
  caption = "(.*)";
  index = 0;
 }

 public final HWND getHwnd() {
  return hwnd;
 }

 public final void setHwnd(HWND hwnd) {
  this.hwnd = hwnd;
 }

 public final String getWinClass() {
  return winClass;
 }

 public final void setWinClass(String winClass) {
  this.winClass = winClass;
 }

 public final String getCaption() {
  return caption;
 }

 public final void setCaption(String caption) {
  this.caption = caption;
 }

 public final int getIndex() {
  return index;
 }

 public final void setIndex(int index) {
  this.index = index;
 }

 public final HWND getParent() {
  return parent;
 }

 public final void setParent(HWND parent) {
  this.parent = parent;
 } 
}
That is the structure we'll pass search parameters in.
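
The winClass and caption defaults are regular expressions, and later in this post they are compared with String.matches, which succeeds only when the whole string matches the pattern. The sketch below shows what that means for typical locator values. The LocatorPatterns class and its containing helper are hypothetical additions for illustration, not part of Sirius.

```java
import java.util.regex.Pattern;

class LocatorPatterns {

    /** Builds a pattern matching any text that contains the given fragment literally. */
    static String containing(String fragment) {
        // Pattern.quote protects characters like ( ) + that are special in regular expressions.
        return "(?s).*" + Pattern.quote(fragment) + ".*";
    }

    public static void main(String[] args) {
        String caption = "Untitled - Notepad";
        System.out.println(caption.matches("(.*)"));                // true: the default accepts anything
        System.out.println(caption.matches("Notepad"));             // false: the whole caption must match
        System.out.println(caption.matches(containing("Notepad"))); // true: fragment match
    }
}
```

So a locator caption like "Notepad" would not find a window titled "Untitled - Notepad", while a pattern such as "(.*)Notepad" would.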

Major search function

The window search functionality needs an additional endpoint, so we have to create another class containing utility functions. So, let's create a new org.sirius.server.win32.Win32Utils class and add a new method. We'll get code like:

package org.sirius.server.win32;

import com.sun.jna.platform.win32.User32;
import com.sun.jna.platform.win32.WinDef.HWND;

public class Win32Utils {
 
 public Win32Utils() {
  // TODO Auto-generated constructor stub
 }

 public HWND searchWindow(Win32Locator locator){
  User32 user32 = User32.INSTANCE;
  user32.EnumWindows(null, null);
  return null;
 }
}
This is the skeleton of our search method, which should return the handle of the first window matching the search attributes. The core call which should do all the magic for us is the EnumWindows function. It walks through all the top-level windows, calling the specified callback enumeration procedure, which should perform the comparison. For now I've just set all parameters to null. But that's just for a while.

Enumeration procedure

Now it's time to make the core part of the window search. Firstly, let's prepare the skeleton of our enumeration procedure. In the JNA library that should be a class implementing the WinUser.WNDENUMPROC interface; that is, the class should implement the callback method. If it returns true, EnumWindows switches to the next window. If false, the enumeration ends. I'll add the WNDENUMPROC class inside the Win32Utils class as no one else uses it. Initially it looks like:

public class Win32Utils {
 ..........
 
 public class WNDENUMPROC implements WinUser.WNDENUMPROC {
  
  @Override
  public boolean callback(HWND arg0, Pointer arg1) {
   return true;
  }
 }
 ..........
}
But it's not enough to implement the method. We should pass the search criteria into the procedure. Also, we should be able to return the HWND in case it was found. So, we'll add a private field of Win32Locator type, create a getter for it and add a constructor which initializes the locator. The updates are as follows:
public class Win32Utils {
 ..........
 
 public class WNDENUMPROC implements WinUser.WNDENUMPROC {
  
  private Win32Locator locator;
  
  public WNDENUMPROC(Win32Locator locator){
   this.locator = locator;
  }
  
  /**
   * @return the locator
   */
  public final Win32Locator getLocator() {
   return locator;
  }
  
  @Override
  public boolean callback(HWND arg0, Pointer arg1) {
   return true;
  }
 }
 ..........
}
Additionally, we'll add one more field to the WNDENUMPROC class: it stores the current index among the matching windows found so far. The updates are:
 private int currIndex;
 private Win32Locator locator;

 public WNDENUMPROC(Win32Locator locator){
  this.locator = locator;
  currIndex = 0;
 }
Now we are ready for the callback procedure implementation. We'll do it in several steps:
  1. Compare window caption:
      @Override
      public boolean callback(HWND arg0, Pointer arg1) {
       User32 user32 = User32.INSTANCE;
       int length = user32.GetWindowTextLength(arg0) + 1;
       char buf[] = new char[length];
       
       user32.GetWindowText(arg0, buf, length );
       String text = String.valueOf( buf ).trim();
       
       if( !text.matches( locator.getCaption() ) ){
        return true;
       }
       ...............
       return false;
      }
    
  2. Compare window class name:
      @Override
      public boolean callback(HWND arg0, Pointer arg1) {
       ...............
       buf = null;
       buf = new char[128];
    
       user32.GetClassName(arg0, buf, 128);
       String clazz = String.valueOf( buf ).trim();
       
       if( !clazz.matches( locator.getWinClass() ) ){
        return true;
       }
       ...............
       return false;
      }
    
  3. Compare index. If the current index is still below the requested one, it is incremented and the window is skipped. Otherwise the window handle is stored in the locator. The code is:
      @Override
      public boolean callback(HWND arg0, Pointer arg1) {
       ...............
       if( currIndex < locator.getIndex() ){
        currIndex++;
       }
       else {
        locator.setHwnd( arg0 );
       }
       ...............
       return false;
      }
    
  4. Finalizing processing. When all checks are passed, it means we've found the required window. So, we should fill the locator with the detailed window information and return false. We'll end the callback with the following code:
      @Override
      public boolean callback(HWND arg0, Pointer arg1) {
       ...............
       if( locator.getHwnd() == null ) return true;
       
       buf = null;
       locator.setCaption(text);
       locator.setWinClass(clazz);
       
       return false;
      }
    
Now the callback function is ready.
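
To see how the caption, class and index checks play together without touching real windows, the same logic can be run against a plain list. This is a simulation under stated assumptions, not the Sirius code: EnumSimulation, FakeWindow and Matcher are hypothetical names, integer handles replace HWND, and a simple loop stands in for EnumWindows.

```java
import java.util.Arrays;
import java.util.List;

class EnumSimulation {

    /** Hypothetical stand-in for a top-level window; real code would query User32 instead. */
    static final class FakeWindow {
        final int handle;
        final String winClass;
        final String caption;

        FakeWindow(int handle, String winClass, String caption) {
            this.handle = handle;
            this.winClass = winClass;
            this.caption = caption;
        }
    }

    /** Mirrors the callback logic: true means "continue enumeration", false means "stop". */
    static final class Matcher {
        private final String classPattern;
        private final String captionPattern;
        private final int index;
        private int currIndex = 0;
        Integer found = null;

        Matcher(String classPattern, String captionPattern, int index) {
            this.classPattern = classPattern;
            this.captionPattern = captionPattern;
            this.index = index;
        }

        boolean callback(FakeWindow window) {
            if (!window.caption.matches(captionPattern)) return true;
            if (!window.winClass.matches(classPattern)) return true;
            if (currIndex < index) {
                currIndex++;      // a match, but not the requested one yet
                return true;
            }
            found = window.handle;
            return false;         // requested match reached, stop the enumeration
        }
    }

    static List<FakeWindow> sampleDesktop() {
        return Arrays.asList(
                new FakeWindow(1, "Notepad", "a.txt - Notepad"),
                new FakeWindow(2, "CalcFrame", "Calculator"),
                new FakeWindow(3, "Notepad", "b.txt - Notepad"));
    }

    /** Plays the role of EnumWindows: feeds windows to the callback until it returns false. */
    static Integer search(List<FakeWindow> windows, String classPattern,
                          String captionPattern, int index) {
        Matcher matcher = new Matcher(classPattern, captionPattern, index);
        for (FakeWindow window : windows) {
            if (!matcher.callback(window)) break;
        }
        return matcher.found;
    }

    public static void main(String[] args) {
        System.out.println(search(sampleDesktop(), "Notepad", "(.*)", 1)); // prints 3
    }
}
```

With index 0 the first Notepad entry is returned, with index 1 the second one, and an index beyond the number of matches yields null.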

Putting it all together

The final thing to be done is to hook the above callback function into the searchWindow method. The final version looks like:

 public HWND searchWindow(Win32Locator locator){
  User32 user32 = User32.INSTANCE;
  
  WNDENUMPROC enumProc = new WNDENUMPROC(locator);
  Pointer pt = Pointer.NULL;
  
  user32.EnumWindows(enumProc, pt);
  return enumProc.getLocator().getHwnd();
 }
Just for spot testing, I've checked the above functionality with the following code:
 public static void main(String[] args){
  Win32Locator locator = new Win32Locator();
  locator.setWinClass("Notepad");
  Win32Utils utils = new Win32Utils();
  HWND hwnd = utils.searchWindow(locator);
  System.out.println("" + hwnd );
 }
It looks for a Notepad window. If it finds one, it prints the string representation of its handle. Otherwise, the output text is "null".

In order to complete the work on the server side we should do the following:

  1. Annotate the Win32Utils class with the @WebService annotation
  2. Update the default configuration file with the following entry:
    http://${HOST}:${PORT}/win32/utils          ,org.sirius.server.win32.Win32Utils             ,Local
    
After that we have a new endpoint available at the http://<hostname>:<port>/win32/utils URL, and its WSDL can be viewed at http://<hostname>:<port>/win32/utils?wsdl.

Generating client code

Now we can proceed with Java client generation. It was described in detail in my previous posts, so I won't dwell on it here.

The key thing here is that the client now contains too many classes and there's no centralized way to use them. It would be convenient to have some client classes which act as containers for the generated proxy classes and provide centralized access. For this purpose we should add extra classes which fit the following structure:

So, all that's left to be done is to add 2 classes:
  • Win32CoreClient:
    package org.sirius.client.win32.core;
    
    import org.sirius.client.win32.core.kernel.Kernel32LibProxy;
    import org.sirius.client.win32.core.user32.User32LibProxy;
    
    public class Win32CoreClient {
     
     private Kernel32LibProxy kernel32;
     private User32LibProxy user32;
    
     public final Kernel32LibProxy kernel32() {
      return kernel32;
     }
    
     public final User32LibProxy user32() {
      return user32;
     }
     
     public Win32CoreClient() {
      kernel32 = new Kernel32LibProxy();
      user32 = new User32LibProxy();
     }
    }
    
  • Win32Client:
    package org.sirius.client.win32;
    
    import org.sirius.client.win32.core.Win32CoreClient;
    import org.sirius.client.win32.utils.Win32UtilsProxy;
    
    public class Win32Client {
     
     private Win32CoreClient core;
     private Win32UtilsProxy utils;
     
     public Win32Client() {
      core = new Win32CoreClient();
      utils = new Win32UtilsProxy();
     }
     
     public final Win32CoreClient core() {
      return core;
     }
     
     public final Win32UtilsProxy utils(){
      return utils;
     }
    }
    
Such client code can be the basis for higher-level client abstractions representing various window classes and all related operations. At the same time we've wrapped the core functionality in a single access interface. Now we can make calls like:
Win32Client client = new Win32Client();
int pid = client.core().kernel32().getCurrentProcessId();
Thus we don't have to create multiple instances of proxy classes or juggle multiple variables. There is only one entry point.

Summary

All right. This time we've got the following results:

What was planned                                    | Done/Failed | Comments/What should be done
Prepare server functionality for Win32 interaction  | Done        |
Prepare server functionality for windows search     | Done        |
Update client side with Win32 functionality support | Done        |
All planned things were done. The next steps are about expanding the functionality to different programming languages, and it's also time to build more complicated client functionality supporting interaction with windows. But that's a matter for further posts. For now we're done.
