Thursday, October 13, 2011

All web applications are stateful

I recently read some blog posts (1 and 2) about web application development and the evolution from thin clients to fat clients, from server-centric to client-centric dynamic applications.

Traditionally web clients have been dumb and have known basically nothing. Now the demands are the opposite: servers are stateless or dumb and clients contain all the state and logic. Everything should be "web scale". Systems should be able to handle millions of requests per second. Of course not everyone is building Facebook or Twitter, but that is the trend.

In my mind all this boils down to one big challenge: where is the state of the application handled? Is it on the server, on the client, or on both? There are a number of frameworks with different state abstractions, from "share nothing" stateless frameworks (e.g. Play) to stateful frameworks (e.g. Context and Lift). They all approach this dilemma in different ways.

So, I decided to make some comments on the matter. Basically, I'm trying to remind you that even if an application is built with a "share nothing" mindset, it does not mean that state does not exist. It is still everywhere and it requires a lot of thought.

I should note that this post contains a lot of questions without answers. I'd be happy to hear your thoughts.

Where is the state of data?

Web applications are not built for nothing. They most likely modify data in different ways. It is also likely that the data is shared amongst multiple users. This creates state: the shared state of data.

In my understanding, the basic idea of stateless client-centric approaches is that the client fetches all the data it needs and manipulates it locally. Then, at specific points, the local data is synchronized with the server and exposed to others. Now, my concerns are:

  1. How much information must be fetched from the server? If the database contains millions of records, does the client need all of them or just some subset? How does the client know what the correct subset is?
  2. If the client has only a subset of the information in use, how does it know how the "unknown" part of the data affects it?
  3. How does the client know when something changes? The client's own changes are obviously easy to synchronize, but what about changes made by others?
  4. How is the changeset fetched? How does the client know which changes to fetch instead of downloading everything just in case?

In more traditional approaches the client has a web page that contains the relevant data given by the server and nothing more. When some action is taken, a form or similar is submitted. The server handles it and a new page with brand new state is returned. If there are conflicts during the form submit, it is the server's responsibility to do something about them. In any case the state is cleaned and a new page is shown.

But if the server only provides data, collections of objects, how does the client keep track of everything? And if something nasty happens, how does the client know about it? And what do you do if that is the case? How do you clean the state in the client? Reload, perhaps...

Where is the state of business logic?

I believe that one driving force behind fat clients is that business logic could be executed on clients, reducing the amount of processing power needed on the server. I may understand this incorrectly, but if such a scenario is assumed, doesn't it lead to logic duplication?

The point is that the client cannot be trusted and it does not know everything. For instance, it is basic knowledge that data validation done on the client must be revalidated on the server. It is just too easy to bypass all client-side validations. So, it quickly leads to a situation where logical decisions should also be revalidated on the server.

The application state shared among different clients is also a challenge. Let's say that the application is some kind of calendar-based reservation system for hairdressers. It allows clients to choose a hairdresser and then create, remove or rearrange reservations online. So, if the client has all the power to manipulate reservations, how is that restricted and controlled?

The server should probably validate the manipulation so that no duplicate reservations can happen and the client cannot forge reservations in some odd manner. Maybe in some case the client cannot find free times for the hairdresser she would like to have. So she simply reserves a time with someone else but forges the hairdresser id. With luck, she might just bypass something and an existing reservation is replaced with the new one on the server.

So, basically I have a feeling that with fat clients, developers end up writing a lot of duplicate validation on the server. And not only field validation, but also logic validation.
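To make the duplication concrete, here is a hedged Java sketch of the kind of server-side checks the hairdresser example implies. The class name and rule limits are my own illustrative assumptions, not from any real system; the point is that these rules must exist on the server even if the client already enforces them in JavaScript.

```java
// Sketch of server-side revalidation: even if the client already checked
// these rules, the server must repeat them, because client-side checks
// are trivially bypassed. Limits here are invented for illustration.
public class ReservationValidator {

    // A reservation must have a sane duration.
    public static boolean isValidDuration(int minutes) {
        return minutes >= 15 && minutes <= 180;
    }

    // A hairdresser id must be a plain numeric id; anything else is a
    // possible forgery attempt.
    public static boolean isValidHairdresserId(String id) {
        return id != null && id.matches("[0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(isValidDuration(60));          // a normal appointment
        System.out.println(isValidHairdresserId("abc"));  // forged id is rejected
    }
}
```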

Signaling exceptional states

This topic does not concern only stateless applications, but is a comment on one general issue. If something goes awry, how do you tell the client?

This dilemma is universal because almost every request/response case is built the same way. The client creates a request and sends parameters with it. The server takes the request, calculates a response and returns it. If something goes wrong, exceptions are thrown and logs are filled with stack traces.

So, how is an exceptional state returned to the client? In my understanding, RESTful approaches normally use HTTP 4xx codes for it. But how do you tell what went wrong and what to do about it?
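One common pattern, sketched here as a hypothetical example rather than anything prescribed by a standard, is to pair the 4xx status code with a small structured error body, so the client learns both what went wrong and what it could do about it. The field names are my own assumptions.

```java
// Sketch of a structured error payload returned alongside an HTTP 4xx
// code. The JSON is hand-rolled to keep the example dependency-free;
// the field names ("code", "message", "action") are illustrative.
public class ApiError {
    private final String code;     // machine-readable, e.g. "RESERVATION_CONFLICT"
    private final String message;  // human-readable explanation
    private final String action;   // hint for the client, e.g. "RELOAD"

    public ApiError(String code, String message, String action) {
        this.code = code;
        this.message = message;
        this.action = action;
    }

    public String toJson() {
        return String.format(
            "{\"code\":\"%s\",\"message\":\"%s\",\"action\":\"%s\"}",
            code, message, action);
    }

    public static void main(String[] args) {
        ApiError error = new ApiError(
            "RESERVATION_CONFLICT",
            "The time slot was already taken",
            "RELOAD");
        System.out.println(error.toJson());
    }
}
```

With a body like this, the client can branch on `code` for programmatic handling and fall back to showing `message` to the user.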

Signaling multiple state changes

Signaling multiple state changes is also a big issue and is closely related to exceptional states. How must a request/response be formulated so that it can give different kinds of answers?

The thing is that normally the client expects a certain type of response and nothing else. This is especially challenging in JavaScript, because if the return value is JSON, the client cannot automatically tell what entity it represents. It must be assumed.

This issue arises especially from the user interface. The more dynamic the application is, the more it will have sections on the page that update themselves independently. To support all the changes, multiple types of information are needed.

This issue may sound distant in traditional MVC-based approaches, where one request is expected to have one answer. But as soon as you start doing something with component-based frameworks, the issue quickly becomes relevant. The mindset changes and you start thinking: "Couldn't I just change that other thing as well?"

So, how is it done? Can one request return multiple responses, or does the client need to make multiple requests? How does the client know which requests to make and how to interpret the responses?
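One possible answer, sketched under my own assumptions (the field names "type", "target" and "payload" are invented), is to wrap every response in an envelope of typed updates. A single request can then carry several independent state changes, and the client can dispatch each one without guessing what the JSON represents.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a response envelope carrying several typed updates, so one
// request can update multiple independent page sections. The JSON is
// hand-rolled to keep the sketch dependency-free.
public class UpdateEnvelope {
    private final List<String> updates = new ArrayList<>();

    // Each update declares its type and target element so the client
    // can route it to the right section of the page.
    public UpdateEnvelope add(String type, String target, String payload) {
        updates.add(String.format(
            "{\"type\":\"%s\",\"target\":\"%s\",\"payload\":\"%s\"}",
            type, target, payload));
        return this;
    }

    public String toJson() {
        return "[" + String.join(",", updates) + "]";
    }

    public static void main(String[] args) {
        String json = new UpdateEnvelope()
            .add("replace", "reservationList", "...")
            .add("notice", "statusBar", "Reservation saved")
            .toJson();
        System.out.println(json);
    }
}
```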

To session or not to session?

The extreme proponents of stateless approaches say that no sessions should ever be used. Session affinity is a bad thing and prevents scalability.

If server-based sessions are not used, an obvious question arises: how does one handle user authentication? How does the server know whether the client is allowed to access certain parts of the system?

One solution is simply to use cookies that hold a user id or similar. But can the client tamper with that information somehow? As soon as that question arises, it is obvious that the cookie must somehow be encrypted or signed so that only the server knows how to use it.
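As a rough illustration of the signing approach, here is a minimal Java sketch using HMAC-SHA256 from the JDK. The cookie layout and secret handling are simplified assumptions; a real implementation would also need key rotation and an expiry timestamp.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of a tamper-evident cookie value: the server signs the user id
// with a secret key and later verifies the signature. The client can
// read the value but cannot forge a valid signature.
public class SignedCookie {
    // Assumption: this secret never leaves the server.
    private static final String SECRET = "server-side-secret";

    public static String sign(String userId) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(
            SECRET.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        String sig = Base64.getUrlEncoder().withoutPadding()
            .encodeToString(mac.doFinal(userId.getBytes(StandardCharsets.UTF_8)));
        return userId + "." + sig;
    }

    public static boolean verify(String cookie) throws Exception {
        int dot = cookie.lastIndexOf('.');
        if (dot < 0) return false;
        // Recompute the signature for the claimed user id and compare.
        return sign(cookie.substring(0, dot)).equals(cookie);
    }

    public static void main(String[] args) throws Exception {
        String cookie = sign("user-42");
        System.out.println(cookie + " valid=" + verify(cookie));
        System.out.println("tampered valid=" + verify("user-1." + cookie.split("\\.")[1]));
    }
}
```

Note that this only proves integrity; as the following paragraphs discuss, it says nothing about forced expiration or logout.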

Now, if the server does not keep any record of the cookie's state, how does one expire the restricted access? How do you log out? The point is that if the server does not keep any state, then even with an expiration time in the cookie, the client can always recreate it and stay logged in forever.

Well, the answer is probably some kind of encrypted and time-stamped cookie that is renewed by the server whenever it is accessed. And during logout the cookie is simply expired by the server.

But still, how does one force expiration? The session can always be discarded by the server, but how do you discard the knowledge of the cookie's content in the client? If the client just keeps recreating the last known working cookie, how does the server know that it is not valid anymore?

So, what might happen is that developers end up writing some custom database-based solution that keeps track of accepted cookies or the user's expected state; in other words, the state of the session.

A real-life case of the state problem

For the last part, I decided to share a real-life puzzle that I encountered recently regarding state.

I'm currently porting my client's codebase to Context and I encountered a situation that actually made me write this post. The system contains a search page for certain items. In a sense it is just basic stuff with a search form, paged results and so on.

Now the twist is that when the first results of the search are shown, it is possible to create a report covering the full set of matching items. The report itself is nothing fancy really, just some statistics and numbers. But the key point is that generating the report takes time.

If there are a lot of matching items, it may take over a minute to finish the report. It is CPU-intensive, database-intensive and IO-intensive. It takes time to crawl the database and crunch the numbers. It would also be nice if users were not able to generate their reports in parallel; that would just starve the system.

So all this screams for one thing: a background job. The search page should create a job on demand, wait for the results to emerge into existence and finally show them. And if there are multiple jobs, they are processed serially. But all this creates new state, the state of the job on the server. It is not the client doing the crunching, it is the server. So, how is it handled?

In Context the problem is relatively simple. All that is needed is to create a proper object to hold the state and progress of the job, give it to a thread pool and associate the object with the page state. After that it is trivial to poll the server periodically and check how the report generation is progressing. When everything is finished, the results are shown in the client. The job lives in the page state, and when the web page is closed, the job is also ready for garbage collection.
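The job object described above might look roughly like this in plain Java. This is my own simplified illustration, not actual Context code: a single-threaded pool gives the serial processing mentioned above, and the page would poll getProgress() until the job finishes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a job object holding the state and progress of a report.
// In Context this object would live in the page scope, so it becomes
// garbage-collectable when the page is closed.
public class ReportJob implements Runnable {
    private final AtomicInteger progress = new AtomicInteger(0);
    private volatile String result;

    @Override
    public void run() {
        for (int chunk = 1; chunk <= 10; chunk++) {
            // ... crawl one slice of the database and crunch numbers ...
            progress.set(chunk * 10);
        }
        result = "report ready";
    }

    // Polled periodically by the page.
    public int getProgress() { return progress.get(); }
    public String getResult() { return result; }

    public static void main(String[] args) throws InterruptedException {
        // One worker thread => jobs are processed serially, as described.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ReportJob job = new ReportJob();
        pool.execute(job);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(job.getProgress() + "% -> " + job.getResult());
    }
}
```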

But how about a stateless application? How are the following questions solved?

  1. Where does the job live? Where is it kept?
  2. How is the job accessed? How does the client know which job is the correct one?
  3. How is the progress of the job tracked? Is it possible to tell the user that, for instance, 40% of the job is done at a certain moment in time?
  4. How is it ensured that one user cannot see other users' jobs?
  5. How do we know when the results of the job have been consumed and can be discarded on the server?

Also, if you have a clustered application, more questions must be tackled:

  1. Which server is doing the crunching?
  2. How is the background job invoked?
  3. How is the data accessed if every request goes to a random server?

I have no answers to these questions. There may be a number of solutions, but I have a feeling that some kind of external job server is required and that the data must be saved temporarily to the database.

Well, that's about it. I hope these thoughts are helpful on your journey in web application development, whatever programming language or framework you are using.

Monday, August 22, 2011

Clustering Context-application in Cloudbees

Scalability and clustering are normally associated with stateless web applications. Because Context is stateful, I began to experiment with what is needed to serve a Context application from the cloud. I was introduced to Cloudbees, which offers 2 nodes in its free subscription plan.

The first important thing to understand is that, in Context, state does not rely on cookies and thus does not rely on the HTTP session. Instead it relies on a UUID that is created for each page load; all page updates use that token to identify which page should receive the update request.

This means that, from the clustering point of view, page loads can always go to any random node, but all subsequent Ajax calls on each page must go to the original node. Resource requests (images, JS files, CSS files etc.) can always go to any random node.

So, there are at least three ways to handle clustering that suit Context applications.

1. Session affinity

This is the most basic way to handle the situation. With session affinity all page loads for the same user are directed to the same node. It will work, but it is not the "purest" option. The good thing is that session affinity is a known and well-supported technique, although Cloudbees does not offer it yet. So, once session affinity is offered, everything should be set.

2. Node hinting

This is the way I'd like clustering to happen. In this version each node exposes some kind of token that is sent to the web client, and if the client sends the same token back with a request, the load balancer automatically sends the request to the originating node.

This would allow each page load request to go to any random node, while all Ajax requests would go to the originating node. Unfortunately, to my understanding, this option is not supported.

3. Node proxying

Because the options above were not supported in Cloudbees, I thought of something different. Simply put: could nodes proxy each other?

Basically this scheme is the same as option 2, but with a twist. When a page load is requested, the local IP address and TCP port of the node are sent to the web client (with a hash). These are then sent back to the cloud with each Ajax request.

The receiving node then checks whether the IP address and port belong to it. If so, the call is served. If not, the node proxies the request to the correct node using the local IP address and port. This is a technique that actually works in Cloudbees.

There is a live demo at Cloudbees that demonstrates the behaviour. I had to make a custom HTTP proxy filter, and if you are interested the example is found here. It is not perfect, but it at least shows the idea.

These options have some challenges though. One particular challenge is that, depending on traffic, new nodes can be brought alive or shut down. It is always OK to start new nodes, but shutting them down may be problematic. If a node still contains some users, their pages will suddenly stop working. So, it basically means that the number of nodes should stay the same.

New features in 0.8.4

Also, a new version has been released. Here are the most prominent changes.

Extended live class reloading

Version 0.8.3 brought live class reloading to pages and components, with limitations. The limitation was that the creation of new pages required a server restart so that their mappings would be registered. Version 0.8.4 removes that restriction, and in development mode new pages (or removals) are discovered at runtime.

This change means that most of the time no server restarts are needed. Complete documentation is here.

Path- and Request-param-mapping

This version also brings support for mapping path and request parameters to view component properties or methods. This happens with two annotations, @PathParam and @RequestParam.

Example:
http://www.example.com/myview/10?search=contextfw

@PageScoped 
@View(url="/myview/<id>")
public class MyView extends Component implements ViewComponent {

  @PathParam
  private Long id;

  @RequestParam
  private String search;

  @Override
  public void initialize(ViewContext context) {
    System.out.println(id + ":" + search); // => prints 10:contextfw
  }
}
Complete documentation is here.

JSONP-embedded mode with RequestInvocationFilter

RequestInvocationFilter is a more esoteric and advanced feature and is not needed in everyday use. In essence it is a way to filter page initialization and update requests in a similar manner to a regular servlet filter. The difference is that this filter does not touch resources such as images or CSS files.

The filtering was implemented to enhance embedded mode usage. In normal embedded mode the same-origin policy applies, but by modifying the contextfw.js file and applying an invocation filter, it is possible to embed pages in JSONP mode, and thus pages can be embedded anywhere.

At this moment JSONP mode is experimental, but if you are interested in it, I'm happy to share my experiments so far.

Documentation about embedded mode is here.

Saturday, July 23, 2011

Context 0.8.3 gets live class reloading and other things

This is really exciting. Context 0.8.3 has just been released and it just keeps getting better. Live class reloading has been a long-time dream, and one weekend I decided that I would now try to make it happen.

I have used JRebel for hot code swapping and it has worked great. But because it is a separate commercial product, it cannot be the only option.

So, after digging into information about class loading, about eight hours later live class reloading came alive. :-D And in quite an elegant way, actually.

Now, in Context every page load gets its own set of objects to handle the page state. So, what I actually did was attach a throwaway class loader to that process. When a page is reloaded and class changes are detected, a new class loader is created for new page loads.

The especially nice thing about this approach is that even classes from the service layer can be reloaded.

Basically the only concern is that live class reloading creates two sets of classes: those that are reloadable and those that are not. If a class can be handled purely by annotations and Guice, then it is probably reloadable.

This separation means that non-reloadable classes must not depend on reloadable classes. Otherwise there will be linkage errors. But with reasonable organization that should not be a problem.

Remote method interception and data validation

In Context, page update requests are translated into component method calls, which raises a good question: can such a call be intercepted in an AOP-like manner? This is important, because method execution may require certain privileges or parameter validation.

The normal approach would be to create proxies that handle the AOP stuff. This is, however, problematic because it prevents using the new operator, and I have never liked that restriction.

Now, because the remote call is translated from a certain URL to a method call, it is actually invoked via reflection. So, the solution was simple. I added a new method to the LifeCycleListener interface that is called every time a method is about to be invoked. All data validation and other checks can be done in that method.

Tuesday, June 28, 2011

Secure way to recover lost password or user name

Most web applications that need authentication have a common requirement: the ability to recover a lost password or user name. This has been discussed in many blog posts before, for instance here.

In this post I'm presenting a secure way of recovering a password using Context Framework and its key feature, the page scope. The goal is to make recovery resistant against data forging and data interception. That is, the system should resist false password recovery attempts and the stealing of victims' emails.

In short, the page scope is an exclusive data area on the server that is unique to each opened page in the user's web browser. That is, even if the same page is opened in multiple browser tabs, the page scope is unique for each of them. The page scope is independent of the session and does not use cookies. The security of the scope relies on a random key (UUID) that is generated for each page.

In this scenario I make the following assumptions:

  1. The user has a user name and a password
  2. The user has an email address that she trusts and that is registered to the system
  3. The user has forgotten her credentials and needs a reminder and, if necessary, a way to change the password

On the login page there is a page component called Authenticator. It provides the following features:

  1. Login with user name and password
  2. Credential recovery with email

If the user finds herself in a situation where she has forgotten her username, password or both, she needs to use her email to recover them. The Authenticator contains two additional input fields (which may be hidden at first). The first one is for the user's email address and the second is for a security key used later.

The user enters her email and asks the Authenticator to send a recovery email. The Authenticator receives the request and stores the email in the page scope. To be more accurate, the component itself is stored in the page scope, so storing the email is as simple as setting a class property.

Then the email is forwarded to the Authentication service. The service creates a random key, using for instance a UUID. Then it checks whether the email exists in the system. If the email is found, the service sends a message to that address containing the random key, informing the user that the key should be entered into the second input on the page where the request was made.

If the email is not found, the message simply says so.

After sending the message, the service returns the created key to the Authenticator. The key is returned even if no matching email is found. This way the user cannot start guessing which emails might exist in the system.

The service could also throw an exception or return an error code, but the important thing is that the Authenticator must behave exactly as if the email were valid.

The Authenticator then stores the key in the page scope and begins to wait for the user to enter it.

When the user has received the email, she copies the key and enters it into the security key field. The Authenticator receives it and checks that the entered key and the stored key match. If they don't, an error message is shown. If they do, the component knows that the email is valid.
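The key generation and comparison on the service side could be sketched like this. The constant-time comparison is my own hardening suggestion, not part of the original description, and the class name is invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.UUID;

// Sketch of the service-side key handling: a random key is generated per
// recovery attempt, and the user's input is checked with a constant-time
// comparison so an attacker cannot guess the key byte by byte via timing.
public class RecoveryKey {

    public static String create() {
        return UUID.randomUUID().toString();
    }

    public static boolean matches(String stored, String entered) {
        return MessageDigest.isEqual(
            stored.getBytes(StandardCharsets.UTF_8),
            entered.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String key = create();
        System.out.println("match=" + matches(key, key));
        System.out.println("mismatch=" + matches(key, "guess"));
    }
}
```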

The Authenticator then asks the Authentication service for the credentials that match the given email, and gets the user name as the return value. It then displays the user name as plain text. If the user is now able to remember her password, she simply logs in and everything is done.

If she does not remember her password, the Authenticator then shows an input field for creating a new one. The password can be checked for complexity etc., but in the end the user sends a new password to the server. The Authenticator receives it and asks the Authentication service to change the password for the user name it received during the recovery process.

After that the user can be logged in automatically or required to log in manually. At this point the process is complete.

Why is this scenario safe?


The email and security key are stored on the server and are not requested from the user afterwards, so neither of them can be fabricated during later calls.

Also, intercepting the recovery email is useless, because the random key is usable only on the page where the original request was made.

The page scope lives only as long as the page is alive. As soon as the page is unloaded or expires for some other reason, the security key is no longer usable at all.

Other Benefits


This scenario does not require any secret keys or hashes to ensure that requests are not forged. The contents of the sent email stay simple. The recovery process also does not require external pages to process it; everything is handled in the same component. The process also does not need cookies, because recovery is done outside the session.

This example demonstrates the benefits a stateful web framework has. It provides the means to hide information from clients and store it on the server so that it cannot be tampered with.

If you think that this scenario is unsafe, I would be happy to hear some comments.

This Gist contains a little Java snippet to clarify what the component and service might look like. All templates and possible JavaScript parts are excluded from the example.

Also check out the new features of Context that the 0.8.1 has to offer.

Update: 2011-06-29


I was given a comment on DZone saying that I should emphasize more that this has nothing to do with protecting the entire session, and that is true. This blog post concentrates only on the password recovery process, where a possible attacker might want to forge or otherwise circumvent the process.

For example, an attacker might initiate password recovery with her own email, but forge later requests in such a way that the process continues with someone else's email, or she could simply create a fake process altogether.

For protecting the entire session and the HTTP connection, I recommend familiarizing yourself with the OWASP Top 10 risks, and in this case especially session hijacking.

Version 0.8.1 released

0.8.1


Calling javascript-functions


Previously, calling remote JavaScript functions had to be done using a RemoteCallBuilder that was injected into the component. This was a bit cumbersome.

The RemoteCallBuilder has now been removed and replaced with a new annotation, @ScriptElement, that can be used on properties or method return values. To support this, three classes were introduced: Script, FunctionCall and ComponentFunctionCall.

Speed enhancements


During the build phase, a complex metamodel is created for each component. This was refactored into a more efficient model, which reduces the number of checks and Map accesses to a minimum.

Resource cleaning enhancements


Previously, in development mode, all resources were reloaded on each page access to reflect possible changes, which wasted CPU power. Now resources are reloaded only when something has really changed, which speeds up the system.

XSL-post processing


Because templates are XSL-based, they can be modified programmatically. For that, an XSL post-processing feature was added which enables the XSL to be modified and metadata to be read.

0.8.0


Package and class refactoring


This version had two main goals. First, it was a big refactoring version where packages and class names were refactored to be more concise. The system configuration model was also changed completely.

Resource loading from components


This version also added the possibility to return resources from components. This can be used to return, for instance, JSON for JavaScript handling, or images and PDF files for other purposes.

Server push/Comet/delayed updates


In this version I finally managed to create a model which enables using long polling as a kind of server push. From the framework's point of view it is called a delayed method call.

Context does not itself provide any comet handling because that depends on the web container. It has been tested with Jetty continuations but should be compatible with other implementations too.

Page flow filter


Because Context is highly stateful and each page call consumes some memory, it is possible to exhaust server memory with enough calls. For this I added a page flow filter that can be used to throttle the number of calls if necessary.

Integration with JRebel


Currently, web development using JRebel is quite convenient. In development mode there is practically no need to restart the server, with a few exceptions:

  1. New views and URL changes: When a new view is added or a view URL is changed, a server restart is required, because it is currently impossible to reload servlets in Guice Servlet.
  2. Singleton-scoped entities: There is no good Guice plugin for JRebel, and if Guice-managed entities have dependency changes, they are not reloaded automatically.

For more information visit www.contextfw.net.

Sunday, February 20, 2011

Screencast - Introduction to Context

Now that Context has reached version 0.7, I finally decided that it needs a proper screencast to show how it is actually used. And after many hours, I have a proper screencast to offer.

The introduction is divided into three parts:

  1. An explanation of how Context works, and a demonstration of how to add a page to the system
  2. Creating and adding a component, and creating interaction between server and client. This part is a bit long, but it goes through the entire flow of component creation and usage, so there is a lot of stuff to cover.
  3. Adding interaction between components on the same page

I hope you enjoy them :-) For more information simply go to www.contextfw.net. There you will find all the documentation and ways to contact me for further information.

Update: 2011-09-30 - Changes in version 0.8.3/0.8.4

Since these screencasts were made, a lot has happened and a few things should be noted. First, a minor thing: the annotation @RemoteMethod is now simply @Remoted.

Also, the screencasts contain a lot of server restarts, but 0.8.3/0.8.4 introduced live class reloading, so the screencasts can now be completed without any restarts.

Sunday, January 30, 2011

From 0.6 to 0.7

It has been a long time since the last release, and there have been many changes from 0.6 to 0.7.

Context has finally matured to such a state that it can really be used. At this point it needs projects that try to use it and push its boundaries so that it will evolve into 1.0.

Here are the biggest changes:

Component model simplified


The earlier component model contained quite a few classes and interfaces, such as SimpleElement, CElement, EnhancedElement and EnhancedSimpleElement. These classes were needed when Context was mixing non-annotation-based control with annotation-based control. That has now evolved: all those classes were replaced with a single class, Component, and one annotation, @Buildable.

Attribute handling


In the earlier version there was one class to handle all attribute conversions. That has now been replaced with a serializer for each Java type. The same serializers are also used in JSON serialization when needed.

Name and package refactorings


Many packages and class/annotation/interface names were refactored to reflect their intended purpose more clearly.

Sarissa dependency removed


Earlier, a library called Sarissa was used to parse page updates. This dependency had a nasty habit of conflicting with jQuery, and I found that I really did not need its full capabilities. Sarissa was removed and replaced with a much simpler solution.

What's new


Lifecycle listener


Context now contains a method that can be used to track and interrupt the lifecycle when a page is opened or updated. This is useful for opening and closing database connections, or for otherwise modifying the flow in exceptional circumstances.

Property provider


In the previous version the framework read its properties from system properties. Now this can be intercepted if needed.

Resource loading from jars


In the previous version all dynamic resources needed to exist in the project itself. With the improved resource loading mechanism, resources are also loaded from jar files, which has simplified project configuration.

Buildins


Buildins are a concept similar to mixins. Buildins are classes that can be mixed in during the build process as if they were part of the buildable class itself. This is useful, for instance, for providing external information such as URLs or meta-information that does not belong in the original class itself.