Introduction
Cucumber is not the only engine that supports natural language instructions; it is just one implementation of a natural language instruction interpreter. The language the tests are actually written in is called Gherkin, and it has implementations adapted to different programming languages. Thus we have:
- Cucumber (Ruby)
- Freshen (Python)
- JBehave (Java)
- NBehave (.NET)
- SpecFlow (.NET)
- Behat (PHP)

This list isn't complete, as there may be many similar engines which are simply less popular. All of them share a common set of supported features, but each engine also has its own restrictions and abilities. So the aim of this post is to collect the useful features of each engine listed above and present them in a comparable form. Key features to be mentioned are:
- Documentation availability
- Flexibility in passing parameters
- Auto-complete
- Steps, scenario and feature scoping
- Complex steps
- Hooks and pre-conditions
- Binding to code
- Formatting flexibility
- Built-in reports
- Input data sources support

Each feature is graded on the following scale:

Grade | Criteria |
---|---|
0 | No support at all |
1 | Functionality exists but with serious restrictions |
2 | Major functionality exists |
3 | Full-featured support |
Documentation availability
One of the key factors demonstrating the maturity of an engine is the availability and completeness of its documentation. Indeed, when we start using a new engine the first thing we usually do is read the documentation, where we can find examples, a list of features and so on. The documentation is actually a part of the software product (this is one of the differences between an application and a software product), so a lack of documentation indicates that the product isn't complete. An additional source of documentation is the online material we can find on specialized resources, blogs etc. So, when I estimate the grade for documentation availability, I check the following criteria:
- Documentation is available in general (this makes grade 1 at once)
- Every feature is described and has examples (if so, this makes grade 2)
- There are additional well-developed resources (forums, blogs, user groups) where we can find more information about the engine

Additional resources found for each engine:
- Cucumber:
- Freshen: honestly speaking, I didn't find any specialized resource dedicated to Freshen only. Most likely it is discussed in the more general forums dedicated to BDD.
- JBehave:
  - JBehave - BDD LinkedIn group
- SpecFlow:
  - SpecFlow: Pragmatic BDD for .NET
  - SpecFlow Google group
- Behat:
  - Behat/Mink Users LinkedIn group

Engine | Documentation availability |
---|---|
Cucumber | 3 |
Freshen | 2 |
JBehave | 3 |
NBehave | 1 |
SpecFlow | 3 |
Behat | 3 |
Flexibility in passing parameters
Passing parameters in natural language instructions is a frequent need when we try to re-use existing instructions as much as possible. There are several cases where instructions should be flexible enough to allow variations:
- Passing the actual value - e.g. we have code entering some text into a text field, so we can create a common method where the parameter specifies the text to enter
- Small variations of the instruction while the action to perform is the same - typically we either need a shortened form of the instruction or we just have to re-phrase an existing expression to make the overall test more readable
- Passing complex structures - sometimes we have to pass a set of data grouped under some specific entity. E.g. we have to create an order, filling in all the necessary fields. To code it, we should pass a structure containing all the necessary data. Another case is when we just pass multi-line text.

The engines are compared by the following criteria:
- Regular expressions support
- Tables support
- Multi-line input support
- Extra features - features related to parameter passing which aren't common to all engines but add some syntactic sugar to the tests

Engine | Regular expressions support | Tables support | Multi-line input support | Extra features |
---|---|---|---|---|
Cucumber | 3 | 3 | 3 | 0 |
Freshen | 3 | 3 | 3 | 0 |
JBehave | 2 | 3 | 0 | 3 |
NBehave | 2 | 3 | 0 | 2 |
SpecFlow | 2 | 3 | 3 | 2 |
Behat | 3 | 3 | 3 | 0 |

For example, a Cucumber step can be bound to a regular expression such as:

    Then /the main page is open/

And it will match the following phrases:

    Then the main page is open
    Then I should see the main page is open

So both expressions fit the regular expression. In JBehave, NBehave and SpecFlow the same result can be achieved either with pattern variants or with wildcards. Both options give the desired result, but they look more complex than the equivalent in the scripting languages. That's why JBehave, NBehave and SpecFlow have grade 2 for regular expressions support. All of the engines, however, are equally good at supporting tables, which is why the corresponding column has the highest grade for every engine.
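
The matching behaviour of the step definition above can be checked with plain Ruby regular expressions. This is only a minimal sketch, not Cucumber itself: `STEP_PATTERN` and `matches_step?` are hypothetical names, and it assumes the step text is whatever remains once the leading keyword is stripped.

```ruby
# Unanchored pattern, the same one used in the step definition above.
STEP_PATTERN = /the main page is open/

# Hypothetical helper: drop the leading Gherkin keyword and test the rest.
def matches_step?(line, pattern = STEP_PATTERN)
  text = line.sub(/\A(Given|When|Then|And|But)\s+/, "")
  !!(text =~ pattern)
end

PHRASES = [
  "Then the main page is open",
  "Then I should see the main page is open",
  "Then the login page is open"
]

PHRASES.each do |phrase|
  puts "#{phrase.inspect} -> #{matches_step?(phrase)}"
end
```

Because the pattern is unanchored, any phrase containing the text matches. Anchoring it (for example `/\Athe main page is open\z/`) would accept only the first phrase.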
Auto-complete
I'd say this is one of the most useful features when writing tests with natural language instructions. The key problem is that the natural language instructions used for BDD tests are only artificially natural: the actual code is still behind them. So we still have to keep the phrases in memory, which is difficult in a large project where the number of such instructions runs into the hundreds. As a result, we risk having multiple phrases that express the same thing in slightly different forms, and the entire testing solution grows dramatically because of that.

It's also more effective to build tests from bricks, selecting the phrases we need rather than inventing new ones from time to time. This is where auto-complete is really helpful: you type some key parts of the phrase and select the most appropriate option. Unfortunately this feature is quite rare, and it is mostly available as an IDE plugin for writing stories (which doesn't bind to the actual code or to the regular expressions we use for binding). So it's partially supported by Eclipse plugins. At the moment the only plugin I know of which fully supports auto-complete is the SpecFlow plugin for Visual Studio. So the support table for this feature looks like:

Engine | Auto-complete support |
---|---|
Cucumber | 1 |
Freshen | 1 |
JBehave | 1 |
NBehave | 1 |
SpecFlow | 3 |
Behat | 1 |
Steps, scenario and feature scoping
There are two major cases when we need such a feature:
- We want to run only some sub-group of tests - this is typically done using tags or other meta information at both the feature and the scenario level
- The same instruction should call different code depending on the functionality under test - this relates to scoped steps, where some step definitions are available only when we run specific tags

Engine | Tagging support | Scoped steps support |
---|---|---|
Cucumber | 3 | 0 |
Freshen | 3 | 0 |
JBehave | 3 | 1 |
NBehave | 0 | 0 |
SpecFlow | 3 | 3 |
Behat | 3 | 0 |
Complex steps
Each piece of business functionality consists of many smaller actions performing generic operations. The higher the level of test abstraction, the more generic operations are needed to perform an action. At the same time it's inconvenient to copy/paste a huge number of lines, so it's useful to have a high-level instruction which simply groups generic actions into a bigger unit. This is quite a frequent need, and it can always be done using the engine libraries. So, in terms of grades, all engines have such functionality and get at least grade 2.

But using engine classes inside the test code isn't very convenient. It's much better to call Givens, Whens and Thens explicitly from the code, just like it's done in Cucumber. For instance:

    Given /I'm on the search page/ do
      Given "I'm logged into the system"
      When 'I click on the "Search" link'
      And 'wait for page to load'
      Then 'I should see the "Search" page is open'
    end

Such constructions create an additional abstraction layer where we write almost no code, only text instructions. Sooner or later such a layer appears anyway, and the earlier we switch to it, the less work we need to do to create new tests (we don't need to write the code implementing the text instructions). The engines for scripting languages like Ruby, Python and PHP support such functionality (Cucumber, Freshen and Behat all provide it). Additionally, JBehave has a specific annotation called "Composite" which is very helpful here. From the above information we can make the following table:
Engine | Composite steps |
---|---|
Cucumber | 3 |
Freshen | 3 |
JBehave | 3 |
NBehave | 2 |
SpecFlow | 2 |
Behat | 3 |
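
The idea of a composite step, a phrase implemented purely as a sequence of other phrases, can be sketched without any engine. The registry below is a hypothetical stand-in for an engine's step table (`STEPS`, `define_step` and `run_step` are assumed names, and the low-level steps only record a message instead of driving a real application).

```ruby
# Hypothetical mini-registry: phrase => block.
STEPS = {}
LOG = []

def define_step(phrase, &block)
  STEPS[phrase] = block
end

def run_step(phrase)
  block = STEPS.fetch(phrase) { raise KeyError, "undefined step: #{phrase}" }
  block.call
end

# Low-level steps that would do the real work (here they only log).
define_step("I'm logged into the system")             { LOG << "log in" }
define_step('I click on the "Search" link')           { LOG << "click Search" }
define_step("wait for page to load")                  { LOG << "wait" }
define_step('I should see the "Search" page is open') { LOG << "check Search page" }

# Composite step: no new code, only a sequence of existing phrases.
define_step("I'm on the search page") do
  run_step("I'm logged into the system")
  run_step('I click on the "Search" link')
  run_step("wait for page to load")
  run_step('I should see the "Search" page is open')
end

run_step("I'm on the search page")
puts LOG.inspect
```

Since the composite refers to the other steps by their text, renaming a low-level phrase breaks it at run time; failing loudly on an undefined step, as `run_step` does here, makes that easier to spot.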
Hooks and pre-conditions
Of course, a lot of tests require some initial state before they start running. Typically such actions are implemented using backgrounds. A background is a scenario which runs before each scenario in the feature. Usually there's only one background per feature. Backgrounds are supported by most of the engines in this overview. The exceptions are NBehave and JBehave (I was quite surprised to see that, because it's one of the basic features). You can find out how backgrounds are used in the other engines from the following links:
- Backgrounds in Cucumber
- Backgrounds in Freshen
- Backgrounds in SpecFlow (actually it refers to Cucumber documentation)
- Backgrounds in Behat
Engine | Backgrounds | Hooks |
---|---|---|
Cucumber | 3 | 3 |
Freshen | 3 | 3 |
JBehave | 1 | 1 |
NBehave | 0 | 1 |
SpecFlow | 3 | 3 |
Behat | 3 | 3 |
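
To make the difference between a background and hooks concrete, here is a small Ruby sketch. It is a hypothetical runner written for illustration (`FeatureRunner`, `before`, `after` and `run_scenario` are assumed names, not any engine's API): the background is a list of steps prepended to every scenario, while hooks are blocks of code run around each scenario.

```ruby
# Hypothetical runner: background steps are prepended to every scenario,
# hooks are blocks called before and after each scenario.
class FeatureRunner
  attr_reader :log

  def initialize(background:)
    @background = background
    @before_hooks = []
    @after_hooks = []
    @log = []
  end

  def before(&block)
    @before_hooks << block
  end

  def after(&block)
    @after_hooks << block
  end

  def run_scenario(name, steps)
    @before_hooks.each { |hook| hook.call(name) }
    (@background + steps).each { |step| @log << "#{name}: #{step}" }
  ensure
    # After hooks run even if a step raises.
    @after_hooks.each { |hook| hook.call(name) }
  end
end

RUNNER = FeatureRunner.new(background: ["Given I'm logged into the system"])
RUNNER.before { |name| RUNNER.log << "#{name}: [before hook]" }
RUNNER.after  { |name| RUNNER.log << "#{name}: [after hook]" }

RUNNER.run_scenario("Search", ["When I open the search page"])
RUNNER.run_scenario("Profile", ["When I open my profile"])
puts RUNNER.log
```

The background stays visible in the test text, so it suits business-level pre-conditions; hooks live in code and fit technical setup and cleanup better.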
Binding to code
This characteristic shows how convenient it is to bind text instructions to the code. When we add a new text instruction we should associate it with some executable code. In the same way, if a test instruction is changed we should update the binding. The binding level is also another layer of code which we have to support, and it brings some overhead to test development. Let's see how it works using Cucumber as an example. Say we have a method performing some functionality:

    def test_method
      # Here are some actions
    end

And we have reserved some text instruction which should call this method. Let's say, for example:

    When I call test method

In order to make the proper binding we should write additional code like:

    When "I call test method" do
      test_method
    end

Thus we have an additional code level to support, and we have to spend extra time writing that binding. With such an organization there's also a risk of mixing business and system logic, because the binding is itself executable code and you can put anything there (it can contain multiple method calls). Additional trouble can arise when you mix method calls and step definition calls, or even when you just try to find where a specific method is called. So we should pay more attention to the code organization. The same problem applies to Behat.

The other engines avoid most of these potential problems, as they use annotations/attributes for binding. As a result, each test instruction always corresponds to one specific method. Such one-to-one correspondence simplifies navigation through the code and supports better framework organization: we just have core code (implementing business functionality) which is marked with the expressions binding it to text instructions. So Freshen, JBehave, NBehave and SpecFlow have a serious advantage in this area over Cucumber and Behat, and the grades can be set in the following way:
Engine | Binding to code |
---|---|
Cucumber | 2 |
Freshen | 3 |
JBehave | 3 |
NBehave | 3 |
SpecFlow | 3 |
Behat | 2 |
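
The one-to-one, annotation-style binding described above can be imitated in Ruby to show what it buys. This is only a sketch under stated assumptions: `StepBinding`, `step`, `bindings` and `execute` are hypothetical names, and the mechanism (Ruby's `method_added` hook) merely stands in for the attributes and annotations of the engines that use them; it is not how any of those engines is implemented.

```ruby
# Hypothetical annotation-style binding: `step` marks the next method
# defined in the class, so each phrase maps to exactly one method.
module StepBinding
  def bindings
    @bindings ||= {}
  end

  def step(phrase)
    @pending_phrase = phrase
  end

  # Ruby calls this hook whenever an instance method is defined.
  def method_added(method_name)
    super
    return unless @pending_phrase
    bindings[@pending_phrase] = method_name
    @pending_phrase = nil
  end
end

class SearchSteps
  extend StepBinding

  step "I call test method"
  def test_method
    "test_method called"
  end

  # Not bound to any phrase: a plain helper method.
  def helper
    "helper"
  end
end

def execute(phrase, steps_object)
  method_name = steps_object.class.bindings.fetch(phrase)
  steps_object.public_send(method_name)
end

puts SearchSteps.bindings.inspect
puts execute("I call test method", SearchSteps.new)
```

With this layout the phrase sits directly on the method it triggers, so there is no separate binding block where extra logic could accumulate.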
Formatting flexibility
Since Gherkin was designed to make test representation human-readable, formatting plays an essential role in test design. E.g. it's convenient to indent each scenario, align columns and set off tables. But some engines are still sensitive to the format, which brings serious restrictions. JBehave, for instance, is sensitive to leading tabs in story files, so you can't format such a file well to make it convenient to read. All the other engines don't have such troubles. I hope JBehave will get rid of this in the near future, but for now the grades are distributed in the following way:

Engine | Formatting |
---|---|
Cucumber | 3 |
Freshen | 3 |
JBehave | 1 |
NBehave | 3 |
SpecFlow | 3 |
Behat | 3 |
Built-in reports
Reporting is one of the most important parts of testing and automation. It's not enough to say that a test passed or failed; very often we should identify where it fails. This is vital for functional tests, where we perform a sequence of steps with some interim checks and the test can fail anywhere. We should be able to say what the actual reason for the failure was.

The BDD engines have some advantage in this area. Since tests are written as natural language instructions, we can track which steps were executed (using hooks we can retrieve the text instruction which is currently being processed). So, in addition to error messages, we can produce steps to reproduce, which seriously simplifies the analysis of results. Behat and Cucumber even have such built-in formats. E.g. you can specify HTML output and the engines will generate informative reports.

Unfortunately, some other engines are based on unit test engines like JUnit and NUnit, and they usually produce a standard report which works for unit tests but carries no step information. SpecFlow, for example, works this way: it generates NUnit code (or MSTest code if the appropriate configuration is made) and all tests are executed as NUnit tests. Freshen, JBehave and NBehave have a similar problem. In order to get informative reports we have to spend some time customizing the report using hooks or similar mechanisms. However, it's doable anyway. So let's grade our engines by this criterion with the following values:

Engine | Built-in reports |
---|---|
Cucumber | 3 |
Freshen | 2 |
JBehave | 2 |
NBehave | 2 |
SpecFlow | 2 |
Behat | 3 |
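
The "steps to reproduce" idea mentioned above, recording each text instruction as it is processed, can be sketched as follows. This is a hypothetical runner for illustration only (`ReportingRunner` and its report format are assumptions, not a built-in formatter of any engine); each step is modelled as a pair of a phrase and a lambda.

```ruby
# Hypothetical runner that records each phrase before executing it,
# the way a per-step hook could, and keeps the trail when a step fails.
class ReportingRunner
  attr_reader :report

  def initialize
    @report = []
  end

  def run(scenario_name, steps)
    executed = []
    steps.each do |phrase, action|
      executed << phrase
      action.call
    end
    @report << { scenario: scenario_name, status: :passed, steps: executed }
  rescue StandardError => e
    @report << { scenario: scenario_name, status: :failed,
                 steps: executed, error: e.message }
  end
end

REPORTER = ReportingRunner.new
REPORTER.run("Search", [
  ["Given I'm logged into the system", -> {}],
  ['When I click on the "Search" link', -> { raise "link not found" }],
  ['Then I should see the "Search" page is open', -> {}]
])

REPORTER.report.each do |entry|
  puts "#{entry[:scenario]}: #{entry[:status]}"
  entry[:steps].each { |phrase| puts "  #{phrase}" }
  puts "  error: #{entry[:error]}" if entry[:error]
end
```

The last recorded phrase is the step that failed, so the report reads as a reproduction scenario rather than a bare stack trace.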
Input data sources support
Sometimes it's necessary to keep test data or tests outside of the source code repository. E.g. you can store tests in some external location where other people can update them, or you can store tests as requirement documents. Either way, sometimes it's useful to be able to use such shared resources.

It's also convenient when you can include part of a test in another test located in a different file, as it minimizes copy/paste operations. E.g. you have some steps which are used as a background in one feature and you want to re-use them in other features. Simply including the file you need is much faster than copying repetitive code, and the difference becomes especially clear during maintenance.

So how are things in this space? Not very good, actually. More or less serious support for external resources can be found in JBehave: files can be used not only from the local machine but also from a specific URL (there's even support for Google Docs). Freshen also has the ability to use a URL to reference test data. In terms of inclusions Freshen is good as well, and additionally it lets you specify where the step definitions should be taken from. For the other engines there's no specific information regarding these features. So the grade table for this chapter looks like:

Engine | External Data | Inclusions |
---|---|---|
Cucumber | 0 | 0 |
Freshen | 2 | 3 |
JBehave | 3 | 2 |
NBehave | 0 | 0 |
SpecFlow | 0 | 0 |
Behat | 0 | 0 |
Overview table and conclusions
Let's put together all the grades collected during this article. The overview table looks like:

Engine | Documentation | Regular expressions | Tables | Multi-line input | Extra features | Auto-complete | Tagging | Scoped steps | Composite steps | Backgrounds | Hooks | Binding to code | Formatting | Built-in reports | External Data | Inclusions | Overall |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cucumber | 3 | 3 | 3 | 3 | 0 | 1 | 3 | 0 | 3 | 3 | 3 | 2 | 3 | 3 | 0 | 0 | 33 |
Freshen | 2 | 3 | 3 | 3 | 0 | 1 | 3 | 0 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 3 | 37 |
JBehave | 3 | 2 | 3 | 0 | 3 | 1 | 3 | 1 | 3 | 1 | 1 | 3 | 1 | 2 | 3 | 2 | 32 |
NBehave | 1 | 2 | 3 | 0 | 2 | 1 | 0 | 0 | 2 | 0 | 1 | 3 | 3 | 2 | 0 | 0 | 20 |
SpecFlow | 3 | 2 | 3 | 3 | 2 | 3 | 3 | 3 | 2 | 3 | 3 | 3 | 3 | 2 | 0 | 0 | 38 |
Behat | 3 | 3 | 3 | 3 | 0 | 1 | 3 | 0 | 3 | 3 | 3 | 2 | 3 | 3 | 0 | 0 | 33 |
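
Since the overall score is just the sum of the individual grades, it is easy to recompute whenever a grade changes. A minimal Ruby sketch, with the grades copied from the per-section tables of this article (`GRADES` and `overall` are names chosen here for illustration):

```ruby
# Grades in section order: documentation, regular expressions, tables,
# multi-line input, extra features, auto-complete, tagging, scoped steps,
# composite steps, backgrounds, hooks, binding to code, formatting,
# built-in reports, external data, inclusions.
GRADES = {
  "Cucumber" => [3, 3, 3, 3, 0, 1, 3, 0, 3, 3, 3, 2, 3, 3, 0, 0],
  "Freshen"  => [2, 3, 3, 3, 0, 1, 3, 0, 3, 3, 3, 3, 3, 2, 2, 3],
  "JBehave"  => [3, 2, 3, 0, 3, 1, 3, 1, 3, 1, 1, 3, 1, 2, 3, 2],
  "NBehave"  => [1, 2, 3, 0, 2, 1, 0, 0, 2, 0, 1, 3, 3, 2, 0, 0],
  "SpecFlow" => [3, 2, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 3, 2, 0, 0],
  "Behat"    => [3, 3, 3, 3, 0, 1, 3, 0, 3, 3, 3, 2, 3, 3, 0, 0]
}

# Sum every engine's grades into a single overall score.
def overall(grades)
  grades.transform_values(&:sum)
end

overall(GRADES)
  .sort_by { |_engine, total| -total }
  .each { |engine, total| puts "#{engine}: #{total}" }
```

A plain sum treats every criterion as equally important; a team that cares more about, say, auto-complete than about external data sources could multiply each column by a weight before summing.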

From this overview we can draw the following conclusions:
- Engines for scripting languages have almost the same feature set, with some small variations. They implement a canonical part of the functionality, so it's quite easy to migrate from one engine to another (if there's a need for it)
- JBehave and SpecFlow pay additional attention to unique features which are hardly available in other engines. That brought them additional score, though they still have some problems with fundamental things
- Every engine has gaps, which means all of them have a lot of room to grow. Some features may seem useless for a specific engine, but such features still pay off in some cases.

Great post. I was just hunting for the right BDD framework for .Net and this post helped me a lot.
There are several BDD engines in Python: Behave, Lettuce, Freshen, but only Freshen was mentioned here. Do you have any comparison among them?
I think I will. This article was written quite long ago; some things have changed since then and new capabilities have been discovered. So in the future I think I'll update the comparison with a wider range of engines and criteria.
This post was really great! Do you have any update on this?
About the "hooks" feature for JBehave, I believe this could be what you were looking for:
http://jbehave.org/reference/stable/annotations.html
Good catch! Thank you. No idea why I didn't find it before (maybe it's something new) but these are definitely hooks. That would make the grade at least 2 for this feature.
1. There are a few JBehave plugins for IDEA and Eclipse - they support autocompletion
2. What about Cucumber JVM? Starting from IDEA 12, it officially supports Cucumber-JVM stuff.
3. I think it's possible to scope tests in jbehave, just add some magic: http://java.dzone.com/articles/how-scope-scenarios-jbehave
4. JBehave has GivenStories instead of Backgrounds, not sure that backgrounds will appear in JBehave:
http://jira.codehaus.org/browse/JBEHAVE-392
> 1. There are a few JBehave plugins for IDEA and Eclipse - they support autocompletion
> 2. What about Cucumber JVM? Starting from IDEA 12, it officially supports Cucumber-JVM stuff.
Well, technologies are not static. They're evolving. Not sure about IDEA 12, but for Java I'm using Eclipse and it also has quite a nice editor for features which supports autocompletion.
The key thing is that IDEA 12 was released in December 2012, while this post was written 6 months before that :-) I was thinking of making annual updates to this overview (eventually I'll probably do that), but such an overview takes too much time to create, and the scope of such tools has become wider as a lot of similar engines have appeared.
> 3. I think it's possible to scope tests in jbehave, just add some magic: http://java.dzone.com/articles/how-scope-scenarios-jbehave
That's why JBehave has score 1 rather than 0 in this overview. It means that the feature exists but it isn't something that can be done in a straightforward way. The link shows another trick for doing it. You can compare it with SpecFlow, where there's a reserved keyword for that. However, maybe this is for the best: step scoping is something that may cause a lot of errors.
> 4. JBehave has GivenStories instead of Backgrounds, not sure that backgrounds will appear in JBehave:
http://jira.codehaus.org/browse/JBEHAVE-392
GivenStories looks more like a test dependency than a typical pre-condition, which is what backgrounds are. But yes, it can be used in a similar fashion. That's why JBehave scored 1 on the Backgrounds feature (not 0).
Currently Freshen for Python isn't a good choice. It's tightly integrated with the nose test runner and can't be run without it. Better choices are Lettuce, Behave and Morelia. I wrote about Python's BDD tools on my blog:
http://stolarscy.com/dryobates/2015-04/bdd_tools_in_python/
I'm still searching for a JavaScript BDD tool written in the Cucumber vein. Can you recommend any such tool?
As for JavaScript you can take a look at this: https://cukes.info/docs/reference/javascript
or the corresponding GitHub project: https://github.com/cucumber/cucumber-js
Maybe there is something else but these guys are being developed by the same community as original Cucumber