Evgeny Rovinsky, DAO.Casino Head of QA, wrote a well-tailored article for our readers. Enjoy the read.
Let DAO.Casino tell you about our approach to end-to-end testing and how we use Computer Vision to test our applications. We set high standards, so we write tests and run them regularly against the platform to ensure that the entire software works smoothly. This lets us be certain that all the casino components perform as expected, including the wallet, the web platform and the integration with the Ethereum blockchain. For the best results, we simulate a player’s behavior.
And as UI tests are at the top of the 'testing triangle', at DAO.Casino we naturally have the best ones. Let us explain how it all works.
A standard web application has a set of DOM elements that a common testing framework such as Selenium can interact with. Not all applications expose them, though. Those without such DOM elements are usually web games, which often used to be Flash applications and nowadays use HTML5. Usually, a web browser (and Selenium, which uses it to interact with web apps) sees the entire game as one DOM element, e.g. a canvas. The full application, with its vivid graphics and complicated logic, is a single web element, yet we have to interact with what is inside it and see what happens. So what do we do next?
How Computer Vision works
Computer Vision is the answer, and Python bindings allow us to use an excellent implementation of it: OpenCV. It helps 'see' objects inside a picture. So we capture what the browser is displaying as screenshots and look for the elements we need in them.
OpenCV allows us to find the coordinates of the best 'match' of a template picture on the app image. The next step is to interact with the elements found; the most common interaction is a click on an element. Another is dragging an element from one place on the screen to another (referred to as drag and drop). So it is necessary to simulate a mouse click, double click, mouse hold-and-release and mouse move. All of these actions are available in the ActionChains class of Selenium's Python bindings.
How OpenCV helps recognize elements
First of all, we create templates, which can be prepared using a browser. With these in place, it is possible to find matches of a template on the picture of the whole canvas element.
OpenCV provides the matchTemplate method for this. Before calling it, we convert both the template and the picture to grayscale.
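import cv2
# Convert both images to grayscale; pick the conversion code that matches each image's channel order.
iframe_gray = cv2.cvtColor(iframe, cv2.COLOR_RGB2GRAY)
template_gray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
# Score the template against every possible position in the screenshot.
match = cv2.matchTemplate(iframe_gray, template_gray, cv2.TM_CCOEFF_NORMED)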
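So, match is a matrix slightly smaller than the iframe image: it has one entry for each position where the top-left corner of the template can be placed. Each entry holds a score for how well the template matches the underlying image at that position. For TM_CCOEFF_NORMED the score lies between -1 and 1, with 1 meaning a perfect match.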
Now that we have this matrix, we can use it in two ways:
- To find the single best match (e.g. when we are looking for one button on the original picture). The result is the button’s coordinates.
- To find multiple similar objects. The result is a list of all the matching coordinates.
Finding one best match
In order to find this one best match, we call the minMaxLoc function to obtain the coordinates we are looking for. The only thing left to handle is “false positive” matches: minMaxLoc always returns some location, so we reject matches that are not good enough.
After calling img_match = cv2.minMaxLoc(match), we have the maximum score, img_match[1], as a measure of grayscale match quality (compare_shape in the condition below). A threshold of 0.9 is usually enough to know that the match is good. But in some cases it is not enough: a good ‘color’ match is also required to sift out elements that are similar in shape but wrongly colored. We calculate the color histograms and compare them:
scr_hist = cv2.calcHist([scr_crop], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
img_hist = cv2.calcHist([self.img], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
compare_color_hist = cv2.compareHist(img_hist, scr_hist, cv2.HISTCMP_CORREL)
Our experiments show that the following thresholds work better:
if compare_shape >= 0.9 or (compare_shape >= 0.8 and compare_color_hist >= 0.35):
A very good shape match, or a reasonable match in both shape and color, is sufficient to be confident that the images are the same. Once this check passes, we can interact with the element we found.
In the picture, you can see the template, as well as the place where the best match for the template was found.
Finding multiple matches
As long as we have the match matrix from the previous section, we can use the numpy.where function to get all matching elements.
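import numpy
threshold = 0.9
# Row and column indices of every score at or above the threshold.
loc = numpy.where(match >= threshold)
match_list = []
for pt in zip(*loc[::-1]):
    # loc is (rows, cols); reversing it yields points as (x, y).
    match_list.append(pt)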
match_list contains the list of all the matching coordinates found. Note that there could be several adjacent pixels for the same image match because all of them have a score over the threshold. In this case, match_list would contain several similar elements for the same real match. This works for us, but if it does not suit you, it is possible to make one more pass over the match_list elements and remove all the adjacent ones.
In the picture, you can see the template (one chest) and the four places where the template was found.
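One simple way to make that extra pass is sketched below. The helper name remove_adjacent and the min_distance value of 5 pixels are assumptions; a suitable distance depends on the size of the template.

```python
def remove_adjacent(match_list, min_distance=5):
    """Keep one point per cluster of nearby matches (hypothetical helper)."""
    kept = []
    for x, y in match_list:
        # Skip the point if it lies within min_distance pixels of a kept one.
        is_duplicate = any(
            abs(x - kx) < min_distance and abs(y - ky) < min_distance
            for kx, ky in kept
        )
        if not is_duplicate:
            kept.append((x, y))
    return kept


match_list = [(10, 10), (11, 10), (10, 11), (100, 10), (101, 11), (10, 80)]
unique_matches = remove_adjacent(match_list)
print(unique_matches)  # [(10, 10), (100, 10), (10, 80)]
```

This keeps the first point seen in each cluster rather than the best-scoring one, which is usually close enough for clicking on an element.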
‘ActionChains’ helps to interact with elements
Once the coordinates of the element are calculated, it is possible to interact with it. In the previous sections, we calculated the top-left corner of an element. As we know the size of the element, we can calculate the coordinates of its center.
Using the ActionChains class we can then:
- Click it, using the click method at the element center
- Move it, using the drag_and_drop_by_offset method
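The center calculation and a click inside the canvas can be sketched as follows. The helper names are hypothetical, and the click function assumes Selenium 4.3 or later, where move_to_element_with_offset measures the offset from the element's center (older versions measure it from the top-left corner). It also assumes that screenshot pixels and page pixels are the same size.

```python
def element_center(top_left, template_size):
    """Center of a matched element, in canvas pixel coordinates."""
    x, y = top_left
    width, height = template_size
    return (x + width // 2, y + height // 2)


def offset_from_canvas_center(point, canvas_size):
    """Convert top-left based coordinates to an offset from the canvas center."""
    canvas_width, canvas_height = canvas_size
    return (point[0] - canvas_width // 2, point[1] - canvas_height // 2)


def click_on_canvas(driver, canvas, point):
    """Click a point inside the canvas element (requires Selenium)."""
    # Imported here so the pure helpers above work without Selenium installed.
    from selenium.webdriver import ActionChains

    # Assumption: Selenium 4.3+, offsets are relative to the element's center.
    size = canvas.size
    x_off, y_off = offset_from_canvas_center(point, (size["width"], size["height"]))
    ActionChains(driver).move_to_element_with_offset(canvas, x_off, y_off).click().perform()


# A 40x20 template found at (50, 30) on a 160x120 canvas.
center = element_center((50, 30), (40, 20))
offset = offset_from_canvas_center(center, (160, 120))
print(center, offset)  # (70, 40) (-10, -20)
```

With a live browser session, click_on_canvas(driver, canvas, center) would then perform the click, where driver and canvas are the WebDriver instance and the canvas web element.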
The disadvantage of the algorithm is that it is extremely sensitive to the picture scale. If the scale of the original changes, the template will not be recognized. It also becomes difficult to match a template after a non-affine transformation. The game scale can easily change if you use a different browser, and if the elements surrounding the canvas change, the canvas may be displayed at a different scale.
The solution could be as follows:
- If there is a set of resolutions to be tested, you need a set of templates, one for each resolution.
- If you expect only a slight scale change, you can run the template matching in a loop over scales, say from -10% to +10% with a 2% step. A 2% difference is fine for the matching algorithm. The price is speed: with 11 scales to try, matching becomes at least 10 times slower.
Text recognition can help find some buttons and captions, which could simplify the recognition. There are Python libraries that add Optical Character Recognition (OCR) on top of OpenCV’s functionality, for example Tesseract bindings used together with PIL. That is why we are moving forward using OCR.