Sikuli Tutorial for Selenium Automation

What is Sikuli in Selenium?

Sikuli is an open-source, GUI-based test automation tool. It is mainly used for interacting with elements of web pages and handling Windows-based popups. Sikuli uses the techniques of “Image Recognition” and “Control GUI” to interact with these elements: every element to be automated is captured as an image and stored inside the project.


How to use Sikuli with Selenium WebDriver

Sikuli can be integrated with Selenium WebDriver using the Sikuli JAR file.

Follow the steps below to configure Sikuli with Selenium WebDriver.

Step 1) Download the Sikuli JAR file from the link below, and extract the contents of the ZIP file to a folder.

Sikuli JAR File

Step 2) Create a new Java project in Eclipse and add the JAR file to the build path, along with the Selenium JAR files, using Right Click on the project -> Build Path -> Configure Build Path

Once you have added the JAR file to the project build path, the classes provided by Sikuli can be used.

Screen class in Sikuli

The Screen class is the base class for all the methods provided by Sikuli. It contains predefined methods for the commonly performed operations on screen elements, such as click, double-click, providing input to a text box, hover, etc. The commonly used methods provided by the Screen class are listed below.

Method: click
Description: Clicks on an element on the screen. It accepts the image name as the parameter.
Syntax: Screen s = new Screen(); s.click("QA.png");

Method: doubleClick
Description: Double-clicks on an element. It accepts the image name as the parameter.
Syntax: Screen s = new Screen(); s.doubleClick("QA.png");

Method: type
Description: Provides an input value to an element. It accepts the image name and the text to be sent as parameters.
Syntax: Screen s = new Screen(); s.type("QA.png", "text to enter");

Method: hover
Description: Hovers over an element. It accepts the image name as the parameter.
Syntax: Screen s = new Screen(); s.hover("QA.png");

Method: find
Description: Finds a specific element on the screen. It accepts the image name as the parameter.
Syntax: Screen s = new Screen(); s.find("QA.png");

Pattern class in Sikuli

Pattern class is used to associate the image file with additional attributes to uniquely identify the element. It takes the path of the image as a parameter.

Pattern p = new Pattern("Path of image");

The following are the most commonly used methods of Pattern class.

Method: getFilename
Description: Returns the file name contained in the Pattern object.
Syntax: Pattern p = new Pattern("D:\\Demo\\QA.png"); String filename = p.getFilename();

Method: similar
Description: Returns a new Pattern object with the similarity set to the specified value. It accepts a similarity value between 0 and 1 as the parameter. Sikuli then accepts any element on the screen whose match score is at least that value.
Syntax: Pattern p1 = p.similar(0.7f);

Method: exact
Description: Returns a new Pattern object with the similarity set to 1. It looks only for an exact match of the specified element.
Syntax: Pattern p1 = p.exact();
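The difference between a similarity threshold and an exact match can be illustrated with a small, self-contained Java sketch. This is not Sikuli's matching algorithm; the class SimilarityDemo, the methods similarity and matches, and the tiny grayscale arrays are hypothetical names and data used only to show the idea of accepting a candidate once its score reaches a minimum value.

```java
// Illustrative sketch only: a toy pixel-agreement score, not Sikuli's matcher.
final class SimilarityDemo {

    /** Returns the fraction (0.0 to 1.0) of positions where two equally sized images agree. */
    static double similarity(int[][] a, int[][] b) {
        if (a.length == 0 || a.length != b.length || a[0].length != b[0].length) {
            throw new IllegalArgumentException("images must be non-empty and equally sized");
        }
        int same = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[0].length; x++) {
                if (a[y][x] == b[y][x]) {
                    same++;
                }
            }
        }
        return (double) same / (a.length * a[0].length);
    }

    /** Accepts the candidate when its score reaches minSimilarity (1.0 means exact). */
    static boolean matches(int[][] reference, int[][] candidate, double minSimilarity) {
        return similarity(reference, candidate) >= minSimilarity;
    }

    public static void main(String[] args) {
        int[][] reference = { {10, 10, 10, 10, 10}, {200, 200, 200, 200, 200} };
        // The same image with two pixels altered, for example by anti-aliasing
        int[][] candidate = { {10, 10, 10, 10, 12}, {200, 200, 200, 200, 190} };

        System.out.println(similarity(reference, candidate));   // 8 of 10 pixels agree
        System.out.println(matches(reference, candidate, 0.7)); // true, in the spirit of similar(0.7f)
        System.out.println(matches(reference, candidate, 1.0)); // false, in the spirit of exact()
    }
}
```

A lower threshold tolerates small rendering differences but raises the chance of matching the wrong element, which is why the value usually needs tuning per image.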

Code Example for File Upload using Sikuli

The code below demonstrates the use of Sikuli for file upload in Chrome.

package com.sikuli.demo;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.sikuli.script.FindFailed;
import org.sikuli.script.Pattern;
import org.sikuli.script.Screen;

public class SikuliDemo {

    public static void main(String[] args) throws FindFailed {

        // Point Selenium at the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "D:\\chromedriver.exe");

        // Folder holding the reference screenshots and the file to upload
        String filepath = "D:\\Guru99Demo\\Files\\";
        String inputFilePath = "D:\\Guru99Demo\\Files\\";

        Screen s = new Screen();
        Pattern fileInputTextBox = new Pattern(filepath + "FileTextBox.PNG");
        Pattern openButton = new Pattern(filepath + "OpenButton.PNG");
        WebDriver driver;

        // Open Chrome browser
        driver = new ChromeDriver();

        // Click on Browse button and handle windows pop up using Sikuli
        s.wait(fileInputTextBox, 20);
        s.type(fileInputTextBox, inputFilePath + "Test.docx");
        s.click(openButton);

        // Close the browser
        driver.close();
    }
}

Code Explanation:

Step 1) The first statement sets the path of the driver executable for Chrome.

System.setProperty("webdriver.chrome.driver", "D:\\chromedriver.exe");

Step 2) Use a screen capture tool such as Snipping Tool to take screenshots of the ‘FileTextBox’ input field and the ‘Open’ button in the Windows popup.

The images of the Windows file input text box and the Open button are stored in ‘FileTextBox.PNG’ and ‘OpenButton.PNG’.

Sikuli uses the technique of Image Recognition to recognize elements on the screen. It finds elements on screen solely based on their images.

Example: If you want to automate opening Notepad, you need to store an image of the Notepad desktop icon in a PNG file and perform a click operation on it.

In our case, Sikuli recognizes the file input text box and the Open button on the Windows popup using the stored images. If the screen resolution changes between image capture and test script execution, Sikuli behaves inconsistently. Hence it is always advisable to run the test script at the same resolution at which the images were captured. A change in the pixel size of the images results in Sikuli throwing a FindFailed exception.
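Why a resolution change breaks image-based lookup can be seen in a toy template search. The sketch below is not how Sikuli is implemented; TemplateSearchDemo, findExact and scale are hypothetical names, and the integer grids stand in for screenshots. It searches a "screen" for a stored "image" pixel by pixel, then repeats the search after the screen has been scaled up.

```java
// Illustrative sketch only: exact template search on integer grids, not Sikuli's matcher.
final class TemplateSearchDemo {

    /** Returns {row, col} of the first exact occurrence of template in screen, or null if absent. */
    static int[] findExact(int[][] screen, int[][] template) {
        int th = template.length;
        int tw = template[0].length;
        for (int r = 0; r + th <= screen.length; r++) {
            for (int c = 0; c + tw <= screen[0].length; c++) {
                if (matchesAt(screen, template, r, c)) {
                    return new int[] {r, c};
                }
            }
        }
        return null;
    }

    private static boolean matchesAt(int[][] screen, int[][] template, int row, int col) {
        for (int y = 0; y < template.length; y++) {
            for (int x = 0; x < template[0].length; x++) {
                if (screen[row + y][col + x] != template[y][x]) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Nearest-neighbour upscaling by an integer factor, standing in for a higher resolution. */
    static int[][] scale(int[][] image, int factor) {
        int[][] out = new int[image.length * factor][image[0].length * factor];
        for (int r = 0; r < out.length; r++) {
            for (int c = 0; c < out[0].length; c++) {
                out[r][c] = image[r / factor][c / factor];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] template = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
        int[][] screen = {
            {0, 0, 0, 0, 0},
            {0, 1, 2, 3, 0},
            {0, 4, 5, 6, 0},
            {0, 7, 8, 9, 0},
            {0, 0, 0, 0, 0}
        };
        int[] hit = findExact(screen, template);
        System.out.println(hit[0] + "," + hit[1]);                   // 1,1
        System.out.println(findExact(scale(screen, 2), template));   // null: same content, different pixel size
    }
}
```

The element is still visible on the scaled screen, but the stored image no longer lines up with it pixel for pixel, which is the situation that leads to a failed lookup.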

Step 3) The next statements create the objects for the Screen and Pattern classes. Create a new Screen object, then pass the path of each screenshot image as a parameter to the Pattern constructor.

Screen s = new Screen();
Pattern fileInputTextBox = new Pattern(filepath + "FileTextBox.PNG");
Pattern openButton = new Pattern(filepath + "OpenButton.PNG");
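Since each Pattern is built from nothing more than a file path, a misspelled or missing screenshot is easy to overlook. A small pre-flight check before the test starts can catch that early. The sketch below uses only the Java standard library; the class ImagePreflight and the method missingImages are hypothetical helpers for illustration, not part of Sikuli or Selenium.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical, not a Sikuli API): verify reference images exist.
final class ImagePreflight {

    /** Returns the names from imageNames that are not regular files inside imageDir. */
    static List<String> missingImages(Path imageDir, String... imageNames) {
        List<String> missing = new ArrayList<>();
        for (String name : imageNames) {
            if (!Files.isRegularFile(imageDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the screenshot folder as the first argument; defaults to the current directory
        Path imageDir = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingImages(imageDir, "FileTextBox.PNG", "OpenButton.PNG");
        if (missing.isEmpty()) {
            System.out.println("All reference images found in " + imageDir);
        } else {
            System.out.println("Missing reference images: " + missing);
        }
    }
}
```

Running the check with the screenshot folder as the argument lists any image that still needs to be captured before the Sikuli script is started.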

Step 4) The statements below open the Chrome browser with the URL:

driver = new ChromeDriver();

The URL points to a demo application that demonstrates file upload functionality.

Step 5) Click the ‘Choose File’ button on the page to open the Windows file popup.


Step 6) Wait for the Windows popup to appear. The wait method handles the delay between clicking the browse button and the popup opening; here it waits up to 20 seconds for the file input text box to show up on the screen.

s.wait(fileInputTextBox, 20);
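The idea behind a wait with a timeout can be sketched in plain Java as a polling loop. This is an illustration of the general technique under stated assumptions, not Sikuli's implementation: WaitDemo, waitFor and the pollMillis parameter are hypothetical, and the sketch returns false on timeout, whereas Sikuli's wait throws FindFailed.

```java
import java.util.function.BooleanSupplier;

// Illustrative sketch only: poll a condition until it holds or a timeout expires.
final class WaitDemo {

    /**
     * Checks the condition every pollMillis milliseconds.
     * Returns true as soon as it holds, or false once timeoutSeconds have elapsed.
     */
    static boolean waitFor(BooleanSupplier condition, double timeoutSeconds, long pollMillis) {
        long start = System.nanoTime();
        long timeoutNanos = (long) (timeoutSeconds * 1_000_000_000L);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        // Stand-in for "the popup is visible": becomes true after about 200 ms
        boolean appeared = waitFor(() -> System.currentTimeMillis() - begin >= 200, 20, 50);
        System.out.println("appeared: " + appeared);

        // A condition that never holds runs into the timeout instead
        boolean never = waitFor(() -> false, 0.3, 50);
        System.out.println("never: " + never);
    }
}
```

The timeout is an upper bound: the call returns as soon as the condition holds, so a generous value such as 20 seconds only costs time when the popup fails to appear.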

Step 7) Type the file path into the file input text box and click the Open button

s.type(fileInputTextBox, inputFilePath + "Test.docx");
s.click(openButton);

Step 8) Close the browser

driver.close();

Initially, the script opens the Chrome browser.

It then clicks the ‘Choose File’ button, and the Windows file popup appears. The script enters the file path into the file input text box and clicks the ‘Open’ button.

Once the file upload is complete, the confirmation screen is displayed and the script closes the browser.


Sikuli makes it easy to handle Flash objects on a web page and Windows popups. It works best when the elements of the user interface do not change frequently. Owing to this limitation, from an automation testing perspective, Sikuli is given less preference than other tools such as Robot and AutoIt.