Effective Data Migration Testing

When considering the term software quality assurance and testing, what comes to mind? For me, I think of developing test cases that exercise program functionality and aim to expose flaws in the implementation. In my mind, this type of testing comes mainly before a piece of software is released, and often occurs alongside development. After the product is released, the goals and focus of software quality assurance and testing change.

My views were challenged, however, when I recently came across an interesting new take on software testing. The post by Nandini and Gayathri titled “Data Migration Testing Tutorial: A Complete Guide” provides helpful advice and a process to follow when testing the migration of data. These experienced testers draw on their experiences to point out specific places in the migration of software where errors are likely to occur, and effective methods of exposing these flaws before they impact end-users and the reputation of the company.

The main point that Nandini and Gayathri stress is that there are three phases of testing in data migration. The first phase is pre-migration testing, which occurs, as the name suggests, before migration begins. In this phase, the legacy state of the data is observed and provides a baseline against which the new system can then be compared. During this phase, differences between the legacy application and the new application are also noted. Methods of dealing with these differences are developed and implemented to ensure a smooth transfer of data.

The second phase is the migration testing phase, where a migration guide is followed to ensure that all of the necessary tasks are performed to accurately migrate the data from the legacy application to the new application. The first step of the phase is to create a backup of the data, which can be relied upon as a rollback point in case of disaster. Also during this phase, metrics including downtime, migration time, and time to complete n transfers are recorded, along with other relevant information, so that the success of the migration can be evaluated later.
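To make the backup-and-metrics idea concrete, here is a minimal sketch in Python. It is not taken from the tutorial: the function name `run_migration`, the `migrate_row` callback, and the shape of the returned dictionary are all my own assumptions, and a real migration would back up to durable storage rather than to memory.

```python
import copy
import time


def run_migration(legacy_rows, migrate_row):
    """Migrate rows one at a time, keeping a backup as a rollback point.

    `legacy_rows` and `migrate_row` are hypothetical stand-ins for the
    legacy data set and whatever transformation the migration guide calls for.
    """
    backup = copy.deepcopy(legacy_rows)  # rollback point, taken before any work
    migrated = []
    started = time.perf_counter()
    try:
        for row in legacy_rows:
            migrated.append(migrate_row(row))
    except Exception as error:
        # On any failure, discard partial results and hand back the backup.
        return {
            "ok": False,
            "rows": backup,
            "migrated_count": 0,
            "seconds": time.perf_counter() - started,
            "error": str(error),
        }
    return {
        "ok": True,
        "rows": migrated,
        "migrated_count": len(migrated),
        "seconds": time.perf_counter() - started,  # migration time metric
        "error": None,
    }
```

The `seconds` and `migrated_count` values are the kind of figures that could be recorded during the run and reviewed afterwards when judging whether the migration succeeded.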

The final phase of data migration testing occurs post-migration. During this phase, many of the tests that are used can be automated. These tests compare the data from the legacy application to the data in the new application and alert testers to any abnormalities or inconsistencies. The tutorial lists 24 categories of post-migration tests that should be completed satisfactorily in order to say that migration was successful.
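An automated post-migration comparison might look something like the sketch below. This is only an illustration of the general idea, not one of the tutorial's test categories; the `compare_datasets` function and the assumption that every record carries a unique `id` field are mine.

```python
def compare_datasets(legacy, migrated, key="id"):
    """Compare two lists of record dictionaries by a shared key field.

    Returns the keys that are missing from the new system, the keys that
    appear only in the new system, and the keys whose records differ.
    """
    legacy_by_key = {row[key]: row for row in legacy}
    migrated_by_key = {row[key]: row for row in migrated}

    missing = sorted(legacy_by_key.keys() - migrated_by_key.keys())
    unexpected = sorted(migrated_by_key.keys() - legacy_by_key.keys())
    mismatched = sorted(
        k
        for k in legacy_by_key.keys() & migrated_by_key.keys()
        if legacy_by_key[k] != migrated_by_key[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Made-up sample data: record 2 was altered and record 3 was lost.
legacy = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
migrated = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "B0b"}]
report = compare_datasets(legacy, migrated)
```

A tester would still need to decide which differences are real defects, since some mismatches may be intentional results of the differences between the two applications noted during pre-migration testing.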

Reading this tutorial on data migration testing has certainly changed my views on what testing means. The actual definition seems much broader than what I would have thought in the past. Seeing testing from the perspective of migrating applications gave me insight into the capabilities of and responsibilities placed on software testers. If something in the migration does not go according to plan, it may be easy to place blame on the testers for not considering that case. I enjoyed reading about software testing from this new perspective and learning some of the most important things to consider when performing data migration testing.

The Dangers of Relying on Automated Testing

After listening to Jean Ann Harrison’s discussion about how important critical thinking is in the context of software testing and quality assurance on an episode of Test Talks, I wrote a post about The Limits of Automated Testing. Although Harrison’s explanation was great, I had a few remaining questions and this week chose to look for more information on automated testing. I came across a post by Martin Jansson from March 2017 titled Implication of emphasis on automation in CI, and it seemed to provide me with the more comprehensive view of testing automation that I was looking for.

Jansson starts out on a positive note, stating that he “less frequently see[s] the argumentation that testing is not needed.” To me it is almost comical to think about someone arguing that testing is unnecessary. While I completely understand that managers and executives are enticed by the possibility of saving time and money by not testing software, this is an extremely risky and careless method of creating a product. I doubt that anyone releasing untested software lasts very long or makes any money in the industry.

So if not testing at all is not an option, what are the options? The bare minimum for testing would be running only automated tests, a method that Jansson says is actually used. I have to agree with Jansson, however, when he says that this is not testing; rather, it is simply checking. Instead of exploring parts of the code that are likely to contain bugs, you will simply be checking acceptance criteria. By not exploring the code fully, you are failing to find anything that might be outside the scope of the specification or the requirements. I feel that the following graphic provides an excellent representation of how few tests are actually performed when following a testing strategy that relies solely on automation.

(Source: http://thetesteye.com/blog/2017/03/implication-of-emphasis-on-automation-in-ci/)

What constitutes the perfect blend of automated and manual testing may be impossible to know. What is certain, however, is that automated testing cannot be relied upon as the sole method for testing. Jansson puts it in layman’s terms when he says that “you rarely automate serendipity.” Just as Jean Ann Harrison points out in the Test Talks podcast mentioned earlier, automation is not and will never be a replacement for thought. It is a bit of a relief to know that software development companies are maturing and beginning to understand the importance of having testers who use a combination of automated and manual testing. As long as there continue to be humans writing code, there will need to be humans who test that code.

Predictive Applications and the ‘Datafication’ of Everything

We live in a world where we are constantly being bombarded with information. Not only do we consume insane amounts of data, we also provide other people and businesses with information about ourselves. Whether we are signing up for online mailing lists, ordering magazine subscriptions, or even making dinner reservations, we constantly leave behind information about our habits and preferences, a concept that Charlie Berger refers to as data exhaust in a podcast from October 10, 2017 on Software Engineering Radio. The larger concept that he is describing is known as ‘datafication’, a buzzword in the data science and big data spheres that refers to collecting and storing information about social actions that can be used to perform predictive analyses and targeted marketing.

Specific to the computer science discipline, datafication has implications for the development of predictive applications. In the podcast episode, Berger presents the simple yet extremely effective example of an ATM as lacking in the predictive application sense. Berger wonders why each time that he uses the ATM he is asked which language he would like to use, and why such preferences are not somehow tracked and stored, making for a more seamless and personalized ATM experience. Berger even suggests that the ATM track more than language preferences, offering withdrawal suggestions based on previous transaction data from a similar day of the week or time of the day.
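As a rough illustration of what Berger is suggesting, the sketch below proposes a withdrawal amount based on what the customer most often withdrew on the same day of the week. Berger does not describe an implementation, so the `suggest_withdrawal` function, the shape of the history data, and the "most common amount" rule are all assumptions of mine.

```python
from collections import Counter
from datetime import date


def suggest_withdrawal(history, today):
    """Suggest the amount most often withdrawn on this weekday.

    `history` is a list of (date, amount) pairs. Returns None when the
    customer has no past withdrawals on the same day of the week.
    """
    same_weekday = [
        amount for when, amount in history if when.weekday() == today.weekday()
    ]
    if not same_weekday:
        return None
    # most_common(1) returns a list holding the single (amount, count) pair.
    return Counter(same_weekday).most_common(1)[0][0]


# Made-up history: three withdrawals exactly one week apart, plus one other day.
history = [
    (date(2017, 10, 6), 40),
    (date(2017, 10, 13), 40),
    (date(2017, 10, 20), 60),
    (date(2017, 10, 9), 100),
]
suggestion = suggest_withdrawal(history, date(2017, 10, 27))
```

A stored language preference could be handled even more simply, as a single value looked up by card number, which is what makes its absence from most ATMs so noticeable.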

While it may not be terribly inconvenient to have to choose a language each time you use the ATM, the concept of predictive applications and the advantages associated with creating and using these types of applications become much more apparent when considering larger-scale operations. Retailers can use predictive applications to make important decisions about things like advertising and merchandising. Berger mentions the well-known “parable of the beer and diapers,” where an interesting and entirely unexpected correlation was found between purchases of diapers and beer. While some versions of the tale include the retailer moving the two correlated items next to one another in order to drive increases in sales, this may or may not be factual. Regardless, generating useful information by querying data in this way is a perfect example of the power that predictive applications have.
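The kind of query behind the beer and diapers story can be sketched in a few lines. The example below computes what market-basket analysis calls confidence: of the baskets that contain one item, the fraction that also contain another. The basket data is entirely made up for illustration, and the `confidence` function is my own, not something from the podcast.

```python
def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [basket for basket in baskets if antecedent in basket]
    if not with_antecedent:
        return 0.0
    both = sum(1 for basket in with_antecedent if consequent in basket)
    return both / len(with_antecedent)


# Invented shopping baskets, each represented as a set of item names.
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer", "milk"},
    {"diapers", "bread"},
    {"milk"},
]
diapers_to_beer = confidence(baskets, "diapers", "beer")  # 2 of 3 diaper baskets
```

A high value only shows that the two items tend to be bought together; it says nothing about why, which is part of the reason the original tale is hard to verify.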

Berger repeatedly stresses the importance of moving the algorithm to the data, not vice-versa. By moving the algorithm to the data, we avoid all of the dangers of bypassing security and encryption. Developing applications that perform queries and compile information that is usable and useful to not only data scientists, but normal people as well, is a perfect example of how machine learning and predictive applications can make everyone’s jobs easier.

As a student, I took one of Berger’s closing remarks under careful consideration. Berger states that it is much easier for a programmer to learn how to make a program that interprets data than for a data scientist to translate his specific, one-off analyses into programs. With a newfound understanding of why predictive applications are so important to our data-obsessed society, I look forward to exploring how I can begin developing applications that take advantage of machine learning.

The Limits of Automated Testing

Automated testing is great, and it isn’t going anywhere. The ability to find bugs in a program with minimal human intervention saves both time and money and helps to make the program more reliable. The problem with automated testing is that the automation cannot think like a human. The automated tests simply follow the algorithm that they were programmed to follow. This may be great for finding simple bugs or obvious faults, but may be insufficient for revealing complex or hidden bugs.

On the September 10, 2017 episode of Test Talks, Jean Ann Harrison argues that what we need in order to find these complex, hidden bugs is critical thinking. As an experienced tester who has worked in the industry for nearly 20 years as everything from a mobile tester to a quality assurance auditor, Harrison knows a fair deal about finding bugs in software. As a medical device tester, Harrison states that she often considered not what the product was designed to do, but what it was capable of. When it is a matter of life or death, there is no room for crippling bugs to make it into the final product.

I think that Harrison took many of her experiences as a medical device tester into her current position as a quality assurance auditor of airline entertainment software. In addition to the strict FAA requirements that she must adhere to, once again there are lives at risk if there are bugs that go unnoticed and unaddressed. When considering possible scenarios to test the product under, Harrison repeatedly states that she uses critical thinking skills to think outside of the box; she is always asking herself “what if…?” She states that asking these sorts of questions, along with imagining the possible scenarios in which the product would be used, will lead to the development of meaningful tests and possibly reveal bugs.

Strictly following the testing methods that I’ve learned as a Software QA & Testing student so far seems to keep me inside a bubble. I am only able to test what the method states should be tested. This is sometimes difficult because my mind has a tendency to think of all of the possibilities, much like what Harrison is advocating for testers to do. I want to stray from the strictly defined values that the method demands I input and attempt to use my experience as an end-user and also my experience as a programmer to attempt to break the program. Of course, in the context of testing, breaking a program is a success. It means that you have found a bug and that the finished product will be that much better. I look forward to applying Harrison’s critical thinking strategy to my testing in the future. I am excited to investigate “what if…?” and hopefully make programs better by discovering bugs that would have otherwise gone unchecked.

The Place For Tools in Development

Especially for new or inexperienced programmers, tools can be a great way to help get the ball rolling or learn how to create programs that work. Too often, however, programmers rely on their tools to think for them, a dangerous and often damaging decision. A post by Robert Martin on his Clean Coder Blog, titled “Tools are not the Answer,” explains potential causes of the impending “software apocalypse” and also points out some common mistakes that developers should avoid. Martin acknowledges the value of tools and technologies such as Light Table, but feels that such tools are not going to solve the apocalypse. Tools only further complicate things rather than addressing the underlying cause, which Martin cites as software programmers being generally undisciplined.

Rather than trying to fix bad code with more code, Martin thinks that we should simply aim for more disciplined programming. The reasons he gives for the cause of the apocalypse are:

  1. Too many programmers take sloppy short-cuts under schedule pressure.
  2. Too many other programmers think it’s fine, and provide cover.

I feel that Martin’s first reason is more significant than the second. While deadlines are often outside of the programmer’s control, taking a short-cut that jeopardizes the integrity of the code is a conscious choice. Avoiding this dangerous mistake may require extending deadlines or missing them altogether. Weighing the risk of releasing an inferior product against that of delivering it past its original deadline may depend on the product’s application. Reputations would certainly be more severely impacted by the former, while the latter may cause only minor inconvenience to the end-user.

I don’t see the second reason Martin states as so much of a problem. I would argue that other, more experienced programmers should help to implement the feature properly rather than allowing an overwhelmed programmer to sloppily stumble through a buggy implementation. Martin seems to think that tattling on the sloppy programmer is the solution to making sure that he pays for his carelessness. I think that in any team-driven environment, colleagues should have one another’s backs and everyone should be accountable.

While I stand behind Martin’s opinion that the real reason behind the impending software apocalypse is a lack of general discipline among programmers, I only partly agree with the causes he proposes for this lack of discipline. I think that more importantly than anything else, the programmer must consider the risk he or she is taking by rushing through something without proper and rigorous testing. Some of the examples of software bugs that caused panic and chaos are found in “The Coming Software Apocalypse,” which is the article that Martin continuously refers to in his own blog post. While the code that I am presently writing does not have any real-world consequences (apart from a poor grade if it does not meet the requirements of the assignment), I am challenging myself to write code as if someone’s life depended on the reliability of what I write. Who knows, someday it just might.

Making Testing More Effective Through Increased Testability

While there are few people who would argue that testing is easy, it also should not be prohibitively difficult. The difficult part in testing software should be in deciding what to test, not how to test it. In a post from late September 2017, Michael Bolton describes how important testability is in the creation of a stable project with less risk of bugs at the time of delivery. If releasing a project untested or with insufficient depth of testing sounds risky to you, you are in good company. Making testing easier, known as increasing testability, allows for more thorough testing and (hopefully) a more polished, bug-free finished product.

Bolton describes testability in terms of visibility and controllability. The examples that he gives for visibility are log files and continuous monitoring. For controllability, Bolton cites application programming interfaces or APIs as the most common method for the easy manipulation of the product. An important takeaway from the post is that while it is certainly helpful to the tester if a product has things like log files and an API, this is not all that testability encompasses. Bolton presents the idea of testability as a set of relationships between multiple elements in the design process including the product, the tester, the development team, and the development environment. The overall testability of a product is a result of the complex interactions between all of these people and things involved.
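To picture what visibility and controllability mean at the code level, consider the small sketch below. It is my own invented example, not something from Bolton’s post: the `Thermostat` class and its `read_temperature` parameter are hypothetical. The log call is what gives a tester visibility into each decision, and the injectable temperature reader is what gives a tester control over the input.

```python
import logging


class Thermostat:
    """Hypothetical component used only to illustrate testability."""

    def __init__(self, read_temperature, logger=None):
        # Controllability: the sensor is passed in, so a test can substitute
        # any function that returns the temperature it wants to try.
        self._read_temperature = read_temperature
        self._log = logger or logging.getLogger("thermostat")
        self.heating = False

    def update(self, target):
        current = self._read_temperature()
        self.heating = current < target
        # Visibility: every decision is recorded where a tester can see it.
        self._log.info(
            "current=%s target=%s heating=%s", current, target, self.heating
        )
        return self.heating


# A tester can drive the component with any reading, no hardware required.
cold_room = Thermostat(read_temperature=lambda: 18.0)
warm_room = Thermostat(read_temperature=lambda: 22.0)
```

As the post makes clear, features like these cover only part of testability; the rest depends on the people and the environment around the product.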

The first category that Bolton mentions is epistemic testability. It is impossible for a tester to know all of the bugs in the code before performing any testing. If this were the case, there would be no need for software testers at all. The act of testing explores what Bolton calls a “risk gap,” or the areas in the project that the tester is uncertain about or unfamiliar with. Next, Bolton considers value-related testability, which refers to knowledge of what the intended user of a program is looking to gain. Understanding what is valuable to others allows a tester to focus his or her efforts where they will have the most significant impact. Intrinsic testability refers to the product’s ability to be easily understood by the tester. If a program’s behavior is easy to follow and its state is transparent to the tester, he or she will have a far easier time properly testing it. Since most projects are assigned to teams of people in many different positions, with different tools and knowledge, access to these people and resources is essential for project-related testability. Finally, subjective testability refers to how well the skills of the tester or testing team match the requirements of the project.

Bolton’s more literary definitions of testing were a welcome change from the testing material that I’ve read online and in text. Bolton seems to focus more on the people and the environment that the testing is being conducted in rather than on what specific tests are used. I think that as a student, many of the points that he makes are important to carry with me into any potential professional positions. Evaluating the testability of products through Bolton’s methods will allow me to better manage risk and deliver products with fewer bugs.

Turning the Big Ball of Mud into Modular Code

What Konrad Gadzinowski describes in the opening paragraphs of his post on “Creating Truly Modular Code with No Dependencies,” the “emotional rollercoaster” of developing software, is something that I’m sure anyone who has ever written a program has experienced. I certainly encounter this each time that I’m writing code for a project, whether it is an academic project or a professional one. Eager to begin a project, I often dive in and begin completing the simpler parts first. During this time, it seems that progress moves very quickly. After all of these easy, simple pieces are done, however, progress seems to slow or stall. As the requirements become more complex, I often find myself going back to previous code and rewriting things so that they integrate more seamlessly with the new element that I am adding. This problem is what Gadzinowski describes as the “big ball of mud.” Gadzinowski provides Apache Hadoop as an example of a program with ball-of-mud interdependencies that slow further development and make tracing the source of bugs more difficult. In the image below, each class is represented as a point on the outside of the circle, and each line between two points represents a dependency.

(Image source: https://www.toptal.com/software/creating-modular-code-with-no-dependencies)

With so many interdependent classes, I imagine that untangling the web to trace bugs in Apache Hadoop would be a nightmarish task. Gadzinowski offers a solution to the problem of the ball of mud, however, that seems like sound advice. His suggestion is to use the element design pattern when developing software. This modular pattern aims to create reusable pieces of code that are independent of other classes. This is done through the use of element classes and element listener interfaces. In this way, all of the required dependencies for an element are encapsulated within that element. Outside classes that wish to utilize the element are not concerned with the underlying design of the element; they interact only with the element’s listener. Gadzinowski presents this as a way to increase the flexibility of the element, allowing it to, for example, output to any number of different external environments through an identical listener call.
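Here is a minimal sketch of how I understand the pattern, written in Python. The class names (`Greeter`, `GreeterListener`, `ListCollector`, `ConsolePrinter`) are hypothetical names that I made up for illustration and do not come from Gadzinowski’s post; the point is only that the element depends on nothing but its own listener interface.

```python
from abc import ABC, abstractmethod


class GreeterListener(ABC):
    """The only thing the Greeter element knows about the outside world."""

    @abstractmethod
    def on_output(self, text):
        """Called by the element whenever it has text to deliver."""


class Greeter:
    """Element: contains its own logic and talks outward via its listener."""

    def __init__(self, listener):
        self._listener = listener

    def greet(self, name):
        self._listener.on_output(f"Hello, {name}!")


class ConsolePrinter(GreeterListener):
    """One possible environment: write the output to the console."""

    def on_output(self, text):
        print(text)


class ListCollector(GreeterListener):
    """Another environment: collect the output in memory, handy for tests."""

    def __init__(self):
        self.lines = []

    def on_output(self, text):
        self.lines.append(text)


collector = ListCollector()
Greeter(collector).greet("Ada")  # the same call would work with ConsolePrinter
```

Swapping `ListCollector` for `ConsolePrinter` changes where the greeting ends up without touching the `Greeter` class at all, which is the flexibility the pattern is after.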

While I was immediately willing to listen to the post’s advice after it described a miserable situation that I’ve encountered countless times, I think that reading Gadzinowski’s explanations and examples of the element design pattern has certainly made me a believer. I think that what makes him so credible is his willingness to acknowledge the value in initially jumping into design without worrying too much about the big ball of mud that you may be creating. While this may not be the solution for a final release, it can get the ball rolling and allow for the element pattern to make your code more reusable and stable for production releases later on. I will keep Gadzinowski’s advice in mind the next time that I begin to worry that I have too many interdependent classes to make my classes reusable or easily maintainable.

Which Came First… The Test or the Code?

In episode 31 of a podcast by Brian Okken on his show titled “Test and Code,” Brian has a discussion with guest Paul Merrill about the testing pyramid and why he is frustrated with certain test-driven development (TDD) models. The discussion began as a Twitter disagreement between the two test enthusiasts and blossomed into a full-fledged “civil discussion.” While both Brian and Paul agree about the importance of testing and see the value in test-driven development, they disagree about how extensive testing needs to be in order to be effective. The two also disagree about how tests should be written and to what extent code should be based on testing versus tests written based on code. Although they do not seem to ever reach a consensus on the issue, each of them makes some excellent points and gives examples of personal experiences to support his opinions.

Brian, for example, is fed up with the sheer number of redundant unit tests that are written to test the same thing in traditional pyramid testing. Brian presents an example of a hypothetical method that has a test written for handling negative numbers, but the higher-level method that passes values to the first method will never pass a negative value. Brian sees testing the first method’s ability to handle negative numbers as unnecessary and a waste of time. If these same types of tests are written for hundreds of methods, Brian argues, an extraordinary amount of time is wasted on useless testing. As a rebuttal, Paul argues that if changes are made to the higher-level method that allow negatives to be passed down, the tests would already be written. Clearly not convinced, Brian scoffs at this justification for the test he clearly sees no value in.
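Brian’s hypothetical can be pictured with something like the sketch below. The functions `safe_sqrt` and `hypotenuse` are my own invented stand-ins for the lower-level and higher-level methods; they were not given in the podcast.

```python
import math


def safe_sqrt(value):
    """Lower-level method: rejects negative input."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.sqrt(value)


def hypotenuse(a, b):
    """Higher-level method: the only caller of safe_sqrt in this sketch.

    A sum of two squares is never negative, so the negative branch of
    safe_sqrt cannot be reached through this function.
    """
    return safe_sqrt(a * a + b * b)


def negative_input_is_rejected():
    """The unit test Brian considers wasted effort and Paul would keep."""
    try:
        safe_sqrt(-1)
    except ValueError:
        return True
    return False
```

From Brian’s side, a user-level test of `hypotenuse` already covers everything a user can actually trigger. From Paul’s side, the lower-level test is cheap insurance against the day that a second caller of `safe_sqrt` appears.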

Paul seems to have some sort of personal experience in all of the obscure areas that Brian dismisses as rare or unusual scenarios. Paul seems to support a bottom-up approach to test-driven development, where tests are written that outline every last detail of how a program will perform. Paul argues that this is the best way for tests to effectively drive development. He seems to think that tests are not able to aid in the development process if they are written after development has taken place. Brian, on the other hand, sees this issue from a top-down perspective. He argues that higher, user-level tests should be written first. When the high-level tests are insufficient to further drive development or are too ambiguous to write code for, then lower-level unit tests should be written.

It was extremely interesting to listen to two experienced testers discuss such a controversial topic. While I can see where both men are coming from with their opinions, I seem to lean towards Brian’s side in the argument over test-driven development. I think that the points he makes about a pragmatic approach to testing are important. Rather than generating some ridiculous number of unit tests that may not have any bearing on the actual functioning of the program, the effort should be put into doing what was originally promised by the program specification. I think that I will follow Brian’s advice and also take the pragmatic approach to writing tests, in the hopes of avoiding rewriting tests and code. In the end, it’s not about how many tests can be written, it’s about testing for the correct things in order to deliver a bug-free program.

Is ‘Agile’ really agile?

The Agile software development methodology is based on the “Manifesto for Agile Software Development,” which outlines the values and goals of the platform. For many software development teams, an Agile methodology has replaced the dated Waterfall method. I think that the diagram below does an excellent job of highlighting the key differences between the two methodologies.

(Image source: https://www.seguetech.com/waterfall-vs-agile-methodology/)

The Agile method allows developers more flexibility and involvement in some of the stages of the development that were previously dominated by managers and other higher-ups with no connection to the code itself. In cases where getting a working prototype of a project deployed quickly is of primary importance, the Agile method is the clear choice. In Agile development, responding to changes in the program specification can be done relatively simply through regular meetings and discussions of progress.

The more traditional Waterfall methodology follows a linear sequencing, where each step must be completed in order before the next step is begun. This means that there is often a longer period of development before any product is ready to be deployed. When the product is deployed, however, it will often be more polished and complete. The Waterfall methodology does not respond well to changes in the specification, as this will often require backing up in the process and then reworking each of the steps.

Now, with a general idea of the two methodologies, I could begin to understand where user ayasin is coming from in his rather intense post titled, “Agile Is The New Waterfall.” The post on Medium.com generated quite the buzz of controversy, and even attracted the attention of well-known computer science figures including Uncle Bob. In his post, ayasin argues that Agile has become the tiresome, outdated successor of Waterfall. While he does not offer any solutions, he sure presents a lot of problems with Agile. Ayasin describes the Agile development process as follows, “You just throw stuff together as quickly as possible because you know it’s mostly trash anyway.” This hardly seems like a way to produce quality software. What’s more, ayasin argues, is that more of the responsibility (and potentially blame) is placed on the developers themselves, as they are given the illusion of involvement in the process without any real control of the outcome.

Before finding ayasin’s post on Medium.com, I had only a vague idea of the Waterfall and Agile methodologies. After a bit of research into the two strategies, the post seems to make some excellent points. While I agree with some of them, I wonder whether ayasin is being a bit harsh on Agile. It would seem that when properly implemented and followed, the Agile methodology has significant advantages over the traditional Waterfall method. Reading about the two methods has given me insight into some of the challenges I can expect to face when working on a project in the future. I feel nervous but prepared for these potential challenges and look forward to someday working on projects like the ones described in my research.