How to use Selenium for Driving the Chrome Browser

“Indifference and neglect often do much more damage than outright dislike.” ― J.K. Rowling, Harry Potter and the Order of the Phoenix

Contents

1. Introduction
2. Installing Selenium
3. Java Modules
4. Getting Started with Selenium
5. Adding a Javascript Interface
6. Download It
Conclusion

1. Introduction

Selenium is a handy tool for controlling a web browser (such as Google Chrome) through a program. It allows you to automate tasks on a website by using the browser much the same way a human user would. Such automation can be useful for a variety of tasks including: regression testing of web applications, data extraction and more. In this article, we will learn how to use Selenium to drive the Chrome Browser.

2. Installing Selenium

Selenium consists of two components: a Web Driver program and a Java module for driving the Web Driver. The Web Driver program accepts commands from the Java module and controls the web browser. Let us install the Web Driver program for Chrome (chromedriver): download the latest release for your platform (Windows, Linux or Mac OS) from the Chrome web site.

For Windows, the download includes a program called chromedriver.exe which you can extract to a convenient location such as: C:\webdrivers\chromedriver.exe.

3. Java Modules

To install the required Java modules, we add the following Maven dependency to pom.xml. Maven then takes care of pulling in all the necessary transitive dependencies.

<dependency>
  <groupId>org.seleniumhq.selenium</groupId>
  <artifactId>selenium-java</artifactId>
  <version>3.8.1</version>
</dependency>

To bundle the required jars into a single output jar, we use the following plugin in pom.xml.

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest>
        <mainClass>sample.sample1</mainClass>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-my-jar-with-dependencies</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>

4. Getting Started with Selenium

Let us write a Java program to perform an IMDb search.

The first step is to tell Selenium where chromedriver.exe can be found. This is done by setting a system property.

System.setProperty("webdriver.chrome.driver", "c:\\webdrivers\\chromedriver.exe");
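
The hard-coded path above works only on the Windows machine it was written for. As a minimal sketch, assuming the driver has been extracted into a directory of your choosing (the relative directory `webdrivers` and the class name DriverPathSketch are hypothetical), the property can be set only after checking that the file exists:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

final class DriverPathSketch {

    /** Returns the driver file name for the given value of the os.name property. */
    static String driverFileName(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? "chromedriver.exe" : "chromedriver";
    }

    /**
     * Sets webdriver.chrome.driver if the driver file exists in driverDir.
     * Returns true when the property was set, false when the file is missing.
     */
    static boolean configure(Path driverDir) {
        Path driver = driverDir.resolve(driverFileName(System.getProperty("os.name")));
        if (!Files.isRegularFile(driver)) {
            System.err.println("ChromeDriver not found at " + driver);
            return false;
        }
        System.setProperty("webdriver.chrome.driver", driver.toAbsolutePath().toString());
        return true;
    }

    public static void main(String[] args) {
        // The default directory is an assumption; pass your own as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "webdrivers");
        System.out.println(configure(dir) ? "driver configured" : "driver missing");
    }
}
```

Failing early with a clear message is usually easier to diagnose than the exception raised later when the ChromeDriver object is created.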

Next, we can create the ChromeDriver object.

WebDriver driver = new ChromeDriver();

Navigate to the required URL using the get() method.

driver.get("https://www.imdb.com");

After get() is invoked, we need to wait for the page to load fully before performing further actions. The following snippet waits up to 10 seconds for the title of the page to start with “IMDb”.

new WebDriverWait(driver, 10).until(d -> d.getTitle().startsWith("IMDb"));
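
To make the behaviour of such a wait concrete, here is a rough sketch of the polling idea behind it, written without Selenium so that it runs on its own. It is not Selenium's implementation; the class PollingWaitSketch and its method are hypothetical names. The loop ends as soon as the condition returns a value that is neither null nor false, which also means that a non-null empty list counts as success.

```java
import java.time.Duration;
import java.util.function.Supplier;

final class PollingWaitSketch {

    /**
     * Calls condition repeatedly until it returns a value that is neither
     * null nor Boolean.FALSE, or throws once the timeout has elapsed.
     */
    static <T> T waitUntil(Supplier<T> condition, Duration timeout, Duration pollEvery) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T value = condition.get();
            if (value != null && !Boolean.FALSE.equals(value)) {
                return value; // condition met
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Timed out after " + timeout);
            }
            try {
                Thread.sleep(pollEvery.toMillis());
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                throw new IllegalStateException("Interrupted while waiting", ex);
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for the page-title check: becomes true after about 300 ms.
        Boolean ready = waitUntil(
                () -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("ready = " + ready);
    }
}
```

A condition should therefore return a boolean, or null when the thing being waited for is not there yet.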

Next, we search for the text input element where the search text must be entered. This is done using a CSS Selector as follows:

WebElement e = driver.findElement(By.cssSelector("#navbar-query"));

You can look up the element selection details using the Inspector in the Chrome Browser.

Now we enter the search term.

String movieName = ...;
e.sendKeys(movieName);

How about clicking the search button to actually perform the search? Find the WebElement and click it! Again, look up the button selector using the Google Chrome Inspector (also known as the Developer Console).

WebElement button = driver.findElement(By.cssSelector("#navbar-submit-button"));
button.click();

The search is now submitted to the IMDb site. Wait for the page to load completely before proceeding further. This time, we look for an element in the page DOM (Document Object Model).

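// findElements() returns an empty (non-null) list on no match, so check for a non-empty list.
new WebDriverWait(driver, 10)
    .until(d -> !d.findElements(By.cssSelector("h1.findHeader")).isEmpty());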

Now the IMDb search results page is completely loaded. You can now proceed further looking up elements, entering text or clicking buttons and URLs.

Once you are done with the session, you can quit the Chrome Browser by doing:

driver.quit();

And that is a simple Java and Selenium session for driving the Chrome Browser.

5. Adding a Javascript Interface

While the above Java program is quite easy to write, it gets cumbersome after a while to figure out the CSS and XPath selectors required. Each time, you need to select one or more elements and make sure you got the right ones before interacting with them. For automating a complex website, it soon becomes a non-trivial task. So let us add a scripting interface to our program.

We use Nashorn, a Javascript engine bundled with the JDK from Java 8 through Java 14 (it was removed in JDK 15). This allows us to script all facets of controlling a website using this interface.

For doing this, we need a class which exposes the Selenium interface to Javascript. The class is quite simple and just offers convenient methods for working with the WebDriver. Let us call this class Chrome.

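import java.util.List;
import java.util.function.Function;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.WebDriverWait;

// Thin wrapper that exposes a few WebDriver operations to scripts.
public class Chrome
{
    private WebDriver driver = null;

    public void start() { driver = new ChromeDriver(); }
    public void quit() { if ( driver != null ) driver.quit(); driver = null; }
    public void go(String url) { driver.get(url); }
    // Find elements by CSS selector, below a context element or in the whole page.
    public List<WebElement> css(WebElement ctx,String css) {
        return ctx.findElements(By.cssSelector(css));
    }
    public List<WebElement> css(String css) {
        return driver.findElements(By.cssSelector(css));
    }
    // Find elements by XPath, below a context element or in the whole page.
    public List<WebElement> xpath(WebElement ctx,String xpath) {
        return ctx.findElements(By.xpath(xpath));
    }
    public List<WebElement> xpath(String xpath) {
        return driver.findElements(By.xpath(xpath));
    }
    public void send(WebElement ctx,String text) {
        ctx.sendKeys(text);
    }
    // Overloads (does not override) Object.wait; blocks until func is satisfied or secs elapse.
    public <V> void wait(Function<? super WebDriver,V> func,int secs) {
        new WebDriverWait(driver, secs).until(func);
    }
}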

And here is how we integrate this class with Nashorn. The instance of Chrome is exposed using the variable named c.

ScriptEngineManager mgr = new ScriptEngineManager();
ScriptEngine engine = mgr.getEngineByName("javascript");
Bindings bindings = engine.getBindings(ScriptContext.ENGINE_SCOPE);
bindings.put("c", new Chrome());
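
One thing to watch for in the snippet above: getEngineByName() returns null when no engine is registered under the given name, for example on a JDK that no longer bundles Nashorn, and the next line would then fail with a NullPointerException. The following is a small sketch of a more defensive lookup; the class and method names are hypothetical.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

final class EngineLookupSketch {

    /** Returns the named engine, or throws with a list of the engines that are available. */
    static ScriptEngine requireEngine(String name) {
        ScriptEngineManager mgr = new ScriptEngineManager();
        ScriptEngine engine = mgr.getEngineByName(name);
        if (engine == null) {
            StringBuilder known = new StringBuilder();
            for (ScriptEngineFactory f : mgr.getEngineFactories()) {
                known.append(f.getEngineName()).append(' ').append(f.getNames()).append("; ");
            }
            throw new IllegalStateException(
                    "No script engine named '" + name + "'. Available: " + known);
        }
        return engine;
    }

    public static void main(String[] args) {
        try {
            ScriptEngine engine = requireEngine("javascript");
            System.out.println("Using " + engine.getFactory().getEngineName());
        } catch (IllegalStateException ex) {
            System.err.println(ex.getMessage());
        }
    }
}
```

The error message then tells you directly that the engine is missing and which engine names are available.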

We also add the ability to load a script file at startup. This allows us to add all commands required to initialize the Chrome Browser to a known state.

String arg = args[0];
try { engine.eval(new FileReader(arg)); }
catch(Exception ex) {
    System.err.println("Error executing " + arg + ": " + ex.getMessage());
}
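
The snippet above assumes that a script file is always passed and leaves the FileReader open. A slightly more defensive variant might look like the following sketch. The class and method names are hypothetical, and reading the file as UTF-8 is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

final class StartupScriptSketch {

    /** Evaluates args[0] as a script file; returns true only if it ran without error. */
    static boolean runStartupScript(ScriptEngine engine, String[] args) {
        if (args.length == 0) {
            System.err.println("No startup script given; skipping.");
            return false;
        }
        // try-with-resources closes the reader even when eval() throws.
        try (Reader reader = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            engine.eval(reader);
            return true;
        } catch (IOException | ScriptException ex) {
            System.err.println("Error executing " + args[0] + ": " + ex.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        if (engine == null) {
            System.err.println("No JavaScript engine available");
            return;
        }
        System.out.println("startup script ran: " + runStartupScript(engine, args));
    }
}
```

Returning a boolean lets the caller decide whether to continue to the interactive prompt when the startup script fails.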

Next, we add the main REPL (Read-Eval-Print Loop) to allow us to drive the browser from the console.

BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String line = null;
String prompt = "js >> ";
for (System.out.print(prompt) ;
     (line = in.readLine()) != null ; System.out.print(prompt) ) {
    line = line.trim();
    if ( line.isEmpty() ) continue;
    try { engine.eval(line); }
    catch(Exception ex) {
        System.err.println("Error: " + ex.getMessage());
    }
}
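
The loop is easier to test when the evaluation step is passed in as a function. The sketch below does that, with a hypothetical class ReplSketch and a stand-in evaluator in place of engine.eval(line). Unlike the loop above, it also prints any non-null result, which is an addition for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.function.Function;

final class ReplSketch {

    /** Reads lines until end of input and returns how many were evaluated without error. */
    static int run(BufferedReader in, PrintStream out, Function<String, Object> eval) {
        final String prompt = "js >> ";
        int evaluated = 0;
        try {
            String line;
            out.print(prompt);
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    try {
                        Object result = eval.apply(line);
                        if (result != null) {
                            out.println(result); // the "print" step of the loop
                        }
                        evaluated++;
                    } catch (RuntimeException ex) {
                        // A failing command must not end the session.
                        out.println("Error: " + ex.getMessage());
                    }
                }
                out.print(prompt);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return evaluated;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        // Stand-in evaluator that echoes each line in upper case.
        run(in, System.out, line -> line.toUpperCase());
    }
}
```

Because input, output and evaluator are parameters, the loop can be exercised with a StringReader and without starting a browser.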

And with these updates, we now have a scriptable interface to our program. Let us try out a few examples.

The following script opens the IMDb website and waits for it to load.

c.start();
c.go('https://www.imdb.com');
c.wait(function(d) {
  print("wait for IMDb to load ..");
  return d.getTitle().startsWith("IMDb")
}, 20);

The following block of code finds the search text field, enters the search term into it (held in the array member, argv[1]) and clicks the submit button.

c.css("#navbar-query")[0].sendKeys(argv[1]);
c.css("#navbar-submit-button")[0].click();
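
The script relies on an argv variable that the Java snippets shown earlier do not define. One plausible way to provide it, sketched here as an assumption and not necessarily how the downloadable program does it, is to bind the command-line arguments next to c. The class name and the sample values "startup.js" and "Inception" are hypothetical.

```java
import javax.script.Bindings;
import javax.script.SimpleBindings;

final class ArgvBindingSketch {

    /** Builds bindings that expose the browser wrapper as "c" and the arguments as "argv". */
    static Bindings bindingsFor(Object chrome, String[] args) {
        Bindings bindings = new SimpleBindings();
        bindings.put("c", chrome);
        bindings.put("argv", args.clone()); // copy, so scripts cannot alter the caller's array
        return bindings;
    }

    public static void main(String[] args) {
        // Hypothetical invocation: startup script first, search term second.
        Bindings b = bindingsFor(new Object(), new String[] {"startup.js", "Inception"});
        String[] argv = (String[]) b.get("argv");
        System.out.println("argv[1] = " + argv[1]);
    }
}
```

With bindings like these, argv[0] would be the startup script path and argv[1] the search term, which matches how the script above uses argv[1].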

Next is the wait function which waits for the results page to load.

c.wait(function() {
  print("wait for results page to complete ..");
  return ! c.css("h1.findHeader").isEmpty();
}, 20);

6. Download It

You can download the program and play around with the sample scripts, or develop your own as shown here.

Conclusion

Using Selenium with Java is quite easy as shown here. We have added a script interface to make it even easier. Using this interface, we can develop and distribute scripts for automating various aspects of websites.