At our company, we encourage employees to write blogs on testing and technical topics whenever they can, to share their knowledge and experience. I am one of the reviewers of these articles, so I get to read about a diverse set of topics written by people with varying levels of experience. This morning I read one on “Effective Test Data Preparation”, which rekindled my passion for this space and prompted me to share some thoughts over and above what the author had already penned.
In his write-up, the author, Vivek, talks about the need for effective test data, about re-using test data over time while ensuring it has not been corrupted, and about how test data should include positive, negative, blank and boundary values so that it adequately tests the application. These are all valid points.
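To make the four categories concrete, here is a minimal sketch in Python of what such a data set could look like for a single integer input. The function name `integer_field_cases` and the 18 to 65 range are my own illustrative assumptions; they do not come from Vivek's article.

```python
def integer_field_cases(minimum, maximum):
    """Return labelled test values for a hypothetical integer field
    that is supposed to accept values from minimum to maximum inclusive."""
    midpoint = (minimum + maximum) // 2
    return {
        # Values the application should accept.
        "positive": [minimum, midpoint, maximum],
        # Values on and just around each edge of the valid range.
        "boundary": [minimum - 1, minimum, minimum + 1,
                     maximum - 1, maximum, maximum + 1],
        # Values the application should reject: out of range or wrong type.
        "negative": [minimum - 1, maximum + 1, "abc", 3.5],
        # Missing or empty input.
        "blank": [None, "", "   "],
    }


# Example: an "age" field assumed to accept 18 through 65.
cases = integer_field_cases(18, 65)
for category, values in cases.items():
    print(category, values)
```

Keeping the values labelled by category makes it easier to see at a glance which kind of data a test pass is still missing.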
In addition to all of these, I think it is very important to tailor your test data generation to the type of testing being undertaken. For functional testing, you want a variety of test data covering all the points Vivek mentions above. The same holds for performance testing, but as you scale up the load you also need to maintain diversity in your data even when you are dealing with large volumes. For security testing, focus more on negative data sets, or rather illegal ones that you would not expect your regular users to enter. Scripts and SQL statements are widely used here, much as a hacker would use them to gain access to the internals of the application.
In localization testing, the tester focuses on creating realistic, locale-specific inputs, which may require translation tools. The tester needs to pay special attention to currency, date/time formats, text length, etc. in the test data created here. Before localization testing, when globalization and pseudo-localization testing are done, the tester is focused more on creating gibberish or garbage non-English data to start off with. In accessibility and usability testing, the tester is more focused on using tools such as screen readers and narrators than on the actual test data. For database testing scenarios, you may need not only the test data itself but also to have it populated into the database for ease of access.
So, if you look for a pattern here, you are really customizing your test data to the type of testing you undertake, with the goal of finding as many bugs as possible. Since test data creation takes a lot of time, it is a good idea to use tools wherever possible, with the tester of course doing due diligence to ensure the validity of the automatically generated data. Many freeware tools are available, especially for scenarios such as user account and password creation, where a huge volume of data is typically needed.
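If no ready-made tool fits, a small script can also produce user accounts and passwords in bulk while keeping the data diverse. The sketch below is only an illustration under my own assumptions: the function `generate_accounts`, the chosen username lengths and the password character sets are all hypothetical and would need to match your application's actual rules.

```python
import random
import string


def generate_accounts(count, seed=0):
    """Generate `count` hypothetical user accounts with varied shapes.

    A fixed seed makes the data reproducible, so the same set can be
    regenerated for a later test pass or shared to reproduce a bug.
    """
    rng = random.Random(seed)
    # Assumed stem lengths: very short, typical, long, and very long.
    stem_lengths = [1, 8, 20, 64]
    # Assumed password character sets of increasing complexity.
    alphabets = [
        string.ascii_lowercase,
        string.ascii_letters + string.digits,
        string.ascii_letters + string.digits + string.punctuation,
    ]
    accounts = []
    for i in range(count):
        # Cycle through the shapes so volume does not reduce diversity.
        length = stem_lengths[i % len(stem_lengths)]
        alphabet = alphabets[i % len(alphabets)]
        stem = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        # Appending the index keeps every username unique.
        username = f"{stem}{i}"
        password = "".join(rng.choice(alphabet) for _ in range(rng.randint(8, 32)))
        accounts.append({"username": username, "password": password})
    return accounts


accounts = generate_accounts(1000, seed=42)
print(len(accounts), "accounts generated")
```

The tester still needs to spot-check the output, since a generator like this knows nothing about which combinations the application should accept or reject.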
For every test pass, it helps to work through the following checklist:
1. Understand the focus of the test pass and decide what tests are required
2. Evaluate to see what existing data can be reused
3. Ensure validity of such existing data
4. Analyze gaps for any new data that needs to be created
5. Analyze the data sources (whether the data exists in XML files, local files, databases, another application, etc.) and see if any migrations are needed
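For step 3, one possible way to check that file-based test data has not changed since the last pass is to record a checksum for each file and compare it later. This is a minimal sketch assuming the data lives in ordinary files in one directory; the function names and the `users.csv` sample file are hypothetical. Note that a checksum only detects change, it cannot tell you whether the original data was correct.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(directory):
    """Map each file under `directory` to its fingerprint."""
    root = Path(directory)
    return {
        str(p.relative_to(root)): fingerprint(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(directory, manifest):
    """List files that were added, removed or modified since `manifest`."""
    current = build_manifest(directory)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Demonstration with a temporary directory standing in for a data folder.
with tempfile.TemporaryDirectory() as data_dir:
    sample = Path(data_dir) / "users.csv"
    sample.write_text("id,name\n1,alice\n")
    manifest = build_manifest(data_dir)          # recorded after a test pass
    unchanged = changed_files(data_dir, manifest)
    sample.write_text("id,name\n1,al\n")         # simulate accidental corruption
    drifted = changed_files(data_dir, manifest)

print(unchanged, drifted)
```

In practice the manifest would be saved alongside the test data (for example as a JSON file) at the end of one test pass and checked at the start of the next.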
Like test case maintenance, test data maintenance is a good practice to take on between test passes. The tester can look at new tools in the market that could be used, any test data clean-up that is needed, any new data that can be created, and any user inputs from the field or bugs filed for which the test data is unusual enough to be worth adding to the data repertoire. Such proactive tasks go a long way in saving you time and effort in subsequent test passes and, more importantly, help you find realistic and good bugs. At times, you as a tester will also be required to share your test data with other teams in the product group, including your development team, to help them reproduce issues. Similarly, when bug bashes are held, some test data alongside the basic instructions can save time for everyone in the group. That said, you need to strike a balance in how much test data you share with your team. Test data goes a long way in finding some core bugs, so you would not want to dampen the team’s creativity and limit the bugs they find by sharing all of your test data.
So, chalk out a plan for how and what kind of data you want to generate, use and share, with the goal of enhancing your product quality and containing as many defects as you can before product release. Happy testing!