Software testing keeps evolving to keep pace with the growing complexity of the applications being developed. The goal is to bring high-quality, bug-free applications to market through a smooth software development process. Testing tools and frameworks are evolving to meet that goal, and Selenium is at the forefront of this change.
Selenium automates browsers and is regarded as one of the most widely used frameworks for testing websites and web applications. It was launched in 2004 as "Selenium Core" and later evolved through several versions, namely Selenium 1 (Selenium RC), Selenium 2, and Selenium 3. In 2018, Simon Stewart, a founding member of Selenium, announced Selenium 4 as the next version with major updates. Selenium 4.0 was officially released on October 13, 2021, and can be downloaded from the official website or GitHub.
New features such as the enhanced Selenium Grid, the W3C standardized protocol, new APIs, and relative locators have made Selenium 4 a popular choice among testers and quality analysts. A quick rundown is useful for anyone eager to learn what is new in Selenium.
In this blog, we will explore Selenium 4 in detail, describe its key features, and compare it with Selenium 3, the most widely used earlier version.
Let us begin by getting to know Selenium 4.
A brief introduction to Selenium 4
Selenium 4 is the most recent iteration of the widely used open-source test automation framework for web applications. Over time, Selenium has gone through several version upgrades, each adding new features and deprecating old ones.
Selenium 1 supported multiple browsers through its JavaScript-based implementation. Selenium 2 integrated WebDriver with Selenium RC, combining the strengths of both and addressing their respective drawbacks. Selenium 3 serves as a seamless replacement for users of the WebDriver APIs, with the key change being the removal of the original Selenium Core and its replacement with a WebDriver-backed implementation. WebDriver itself became a W3C standard during the Selenium 3 era, which helped position Selenium as a preferred testing tool for both web and mobile applications.
Selenium 4 is the most recent version and brings advanced features, including a redesigned Selenium Grid, W3C compliance, an enhanced IDE, new APIs, and more. With the redesigned Grid architecture, you can run parallel and distributed tests across hubs and nodes. Selenium 4 lets testers write comprehensive and dependable test scripts for web applications, which helps ensure the delivery of high-quality software.
What is new in Selenium 4?
Selenium 4 introduces valuable improvements, including relative locators, an enhanced Selenium Grid architecture, and an improved Selenium IDE.
However, the most significant change under the hood is the W3C compliance of WebDriver APIs, ensuring more stable and less error-prone cross-browser tests. This eliminates the necessity for encoding and decoding API requests through the JSON wire protocol, as required in Selenium 3 and earlier versions, streamlining communication between browsers and test scripts. This implies direct interaction between WebDriver and the target browser.
Here are some noteworthy changes you'll encounter with Selenium 4:
W3C WebDriver Protocol
Selenium 4 WebDriver is fully W3C standardized. The WebDriver API is now relevant beyond Selenium and is used by various other automation tools. In Selenium 3, tests communicated with the browser through the JSON wire protocol, which required API requests to be encoded and decoded along the way. In Selenium 4, tests communicate directly through the W3C protocol, with no such encoding or decoding step.
Improved Selenium Grid
Selenium Grid is now enhanced with Docker support, simplifying the setup and scaling of Selenium Grid using containers. It also supports IPv6 addresses and HTTPS communication and allows configuration files in TOML format. In Selenium 4, the grid experience is smoother, as there's no longer a need to set up and start hubs and nodes separately. Once you initiate a Selenium server, the grid acts as both a hub and node.
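As a rough illustration of the simplified setup, the sketch below opens a remote Chrome session against a Grid that is assumed to have been started in standalone mode (for example with `java -jar selenium-server-<version>.jar standalone`) and to be listening on port 4444. The class name `GridSession`, the host, and the port are illustrative assumptions, not part of the original article.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

// Hypothetical helper class, for illustration only.
final class GridSession {

    // Builds the Grid endpoint URL from a host and port.
    static URL gridUrl(String host, int port) {
        try {
            return URI.create("http://" + host + ":" + port).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid Grid address", e);
        }
    }

    // Opens a remote Chrome session on the Grid.
    // The caller is responsible for calling quit() on the returned driver.
    static WebDriver openChrome(URL grid) {
        return new RemoteWebDriver(grid, new ChromeOptions());
    }
}
```

A test would typically call `GridSession.openChrome(GridSession.gridUrl("localhost", 4444))`, run its steps, and call `quit()` in a finally block.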
Upgraded Selenium IDE
Selenium IDE is now available for Chrome and can be downloaded from the official website. As a record and playback tool, Selenium IDE now offers a more advanced set of capabilities, including a new plug-in system. Browser vendors can plug into the new Selenium IDE with their own locator strategies and plug-ins. The new CLI runner, based on NodeJS, supports WebDriver playback and parallel execution, and reports essential information such as the time taken and the number of passed and failed test cases.
Enhancements in Selenium Manager
Selenium Manager, bundled with Selenium since version 4.6, automatically locates, downloads, and caches the browser driver (such as ChromeDriver or GeckoDriver) that matches the installed browser. This removes the need to download drivers and configure their paths manually, giving testers a smoother and more productive setup experience.
Changes in Actions Class
Selenium 4 updates the Actions class, which is responsible for executing advanced user interactions such as mouse and keyboard actions. The update adds methods for simulating mouse and keyboard input directly on web elements, offering a more straightforward and intuitive way to perform common actions. Some of the methods available in the Selenium 4 Actions class include:
- click(WebElement element): This method replaces the prior approach of moveToElement(element).click() and allows direct clicking on a specific web element.
- clickAndHold(WebElement element): This method replaces the previous approach of moveToElement(element).clickAndHold() and permits clicking and holding on a specific web element without releasing the click.
- contextClick(WebElement element): This method replaces the previous approach of moveToElement(element).contextClick() and enables executing a right-click operation on a specific web element.
- doubleClick(WebElement element): This method replaces the earlier approach of moveToElement(element).doubleClick() and facilitates performing a double click on a specific web element.
- release(): Initially part of the org.openqa.selenium.interactions.ButtonReleaseAction class, this method has been relocated to the Actions class in Selenium 4. Its purpose is to release the pressed mouse button after executing an action.
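The element-targeted methods listed above can be chained into a single gesture. The following is a minimal sketch; the class name `ActionsSketch` and the idea that the elements have already been located by the caller are assumptions made for illustration.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Actions;

// Hypothetical helper class, for illustration only.
final class ActionsSketch {

    // Right-clicks a trigger element, then double-clicks an item, in one chain.
    static void openAndSelect(WebDriver driver, WebElement trigger, WebElement item) {
        new Actions(driver)
                .contextClick(trigger)
                .doubleClick(item)
                .perform();
    }

    // Drags source onto target: press and hold, move, then release the button.
    static void drag(WebDriver driver, WebElement source, WebElement target) {
        new Actions(driver)
                .clickAndHold(source)
                .moveToElement(target)
                .release()
                .perform();
    }
}
```

Nothing is sent to the browser until `perform()` is called, so the whole chain executes as one sequence of input actions.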
What is deprecated in Selenium 4?
Selenium 4 deprecates the following:
FindsBy
The FindsBy interfaces, found in the org.openqa.selenium.internal package and implemented by the RemoteWebDriver class, are deprecated in Selenium 4. The associated methods are findElement(By) and findElements(By).
Note: End users remain unaffected by these alterations. By using the By class, you can still use findElement(By) and findElements(By).
Fluent Wait
Selenium 4 changes the withTimeout() and pollingEvery() methods of the FluentWait class. Both methods now accept a single parameter of type Duration, replacing the earlier pair of a numeric value and a TimeUnit. The Duration can be expressed in nanoseconds, milliseconds, seconds, minutes, hours, days, and so forth.
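A minimal sketch of the Duration-based FluentWait API follows. The class name `Waits`, the method names, and the 10-second timeout with 500 ms polling are illustrative assumptions; only withTimeout(), pollingEvery(), ignoring(), and until() come from Selenium itself.

```java
import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.FluentWait;

// Hypothetical helper class, for illustration only.
final class Waits {

    // Configures a wait whose timeout and polling interval are both Durations.
    static <T> FluentWait<T> newWait(T input, Duration timeout, Duration interval) {
        return new FluentWait<>(input)
                .withTimeout(timeout)
                .pollingEvery(interval)
                .ignoring(NoSuchElementException.class);
    }

    // Polls every 500 ms, for up to 10 seconds, until the element is found.
    static WebElement findWhenPresent(WebDriver driver, By locator) {
        return newWait(driver, Duration.ofSeconds(10), Duration.ofMillis(500))
                .until(d -> d.findElement(locator));
    }
}
```

Because FluentWait is generic, the same helper can wait on any object, not only on a WebDriver.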
Driver Constructors
Several driver constructors have been deprecated. Capabilities objects are now substituted with Options. To work with the Driver class, you must create an Options object.
- FirefoxDriver: Capabilities are replaced by FirefoxOptions.
- ChromeDriver: Capabilities are replaced by ChromeOptions.
- InternetExplorerDriver: Capabilities are replaced by InternetExplorerOptions.
- SafariDriver: Capabilities are replaced by SafariOptions.
- EdgeDriver: Capabilities are replaced by EdgeOptions.
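For Chrome, the Options-based construction might look like the sketch below. The class name `ChromeSetup` and the specific settings chosen (window size, insecure certificates, page load strategy) are assumptions for illustration; the same pattern applies to the other Options classes listed above.

```java
import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

// Hypothetical helper class, for illustration only.
final class ChromeSetup {

    // Collects browser settings in a ChromeOptions object.
    static ChromeOptions defaultOptions() {
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--window-size=1280,800"); // illustrative argument
        options.setAcceptInsecureCerts(true);
        options.setPageLoadStrategy(PageLoadStrategy.NORMAL);
        return options;
    }

    // Starts a local Chrome session from the Options object.
    // The caller is responsible for calling quit() on the returned driver.
    static WebDriver startChrome() {
        return new ChromeDriver(defaultOptions());
    }
}
```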
Key Features in Selenium 4
Some of the key features of Selenium that can be leveraged for user-centric testing are as follows:
- Selenium 4 comes with native support for the Chrome DevTools Protocol, enabling QA engineers to use Chrome development properties and the APIs provided by Chrome DevTools for improved testing and bug resolution.
- The latest release improves request tracing and logging with hooks, giving automation engineers finer debugging control.
- Selenium 4 makes it easier to work with two windows or tabs in the same session, which helps when you need to open a new window or tab, load a different URL, and perform actions there. To create and switch to a new window, use the newWindow() method with WindowType.WINDOW (or WindowType.TAB for a tab). Each browser window has a unique window handle (ID), which is passed to the switchTo().window() method to switch between windows.
- The Selenium 4 documentation comprehensively covers Selenium Grid 4, Selenium IDE, and the WebDriver W3C protocol. Automation testers can refer to the documentation to familiarize themselves with the new APIs offered by Selenium 4.
- Relative locators, also known as Friendly locators, help locate WebElements that are near, to the left of, to the right of, above, or below a specified element. The relative locator methods are used together with withTagName (in Selenium 4 Java) or with_tag_name (in Selenium 4 Python). Selenium 4 introduces five relative locators:
  - above(): Identifies web elements positioned above a specified element. Syntax: above(By locator) or above(WebElement element).
  - below(): Identifies web elements located beneath the specified element. Syntax: below(By locator) or below(WebElement element).
  - toLeftOf(): Identifies web elements positioned to the left of the specified element. Syntax: toLeftOf(By locator) or toLeftOf(WebElement element).
  - toRightOf(): Identifies web elements situated to the right of the specified element. Syntax: toRightOf(By locator) or toRightOf(WebElement element).
  - near(): Locates web elements within approximately 50 pixels of the specified element. The distance can also be passed explicitly as an argument. Syntax:
    - near(By locator)
    - near(WebElement element)
    - near(By locator, int atMostDistanceInPixels)
    - near(WebElement element, int atMostDistanceInPixels)
- The getFullPageScreenshotAs() method, available in Firefox, captures a screenshot of the complete page. Instead of casting the driver to the TakesScreenshot interface, cast it to FirefoxDriver. Example:
  File src = ((FirefoxDriver) driver).getFullPageScreenshotAs(OutputType.FILE);
- The ChromeDriver class in Selenium 4 extends the ChromiumDriver class. ChromiumDriver has predefined methods that give access to the developer tools. Through this API, various operations can be performed, including taking the network offline, bringing the network back online, retrieving console logs, and loading a secure website.
- Selenium WebDriver 4 introduces enhanced error handling and reporting mechanisms, simplifying the identification and resolution of issues during test execution.
- Selenium 4 offers improved logging and debugging capabilities, helping QA engineers diagnose and address issues more efficiently. It supports the Chrome Debugging Protocol, enabling interaction with the Chrome DevTools API and direct access to advanced debugging features from Selenium tests.
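Two of the features above, relative locators and new-window handling, can be sketched as follows. The class name `KeyFeaturesSketch`, the tag names, and the pixel distance are illustrative assumptions. The sketch uses `RelativeLocator.with(By)`, which Selenium 4 Java offers alongside withTagName.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WindowType;
import org.openqa.selenium.support.locators.RelativeLocator;

// Hypothetical helper class, for illustration only.
final class KeyFeaturesSketch {

    // Builds a locator for <input> elements rendered below the anchor element.
    static By inputBelow(By anchor) {
        return RelativeLocator.with(By.tagName("input")).below(anchor);
    }

    // Builds a locator for <button> elements within maxPixels of the anchor.
    static By buttonNear(By anchor, int maxPixels) {
        return RelativeLocator.with(By.tagName("button")).near(anchor, maxPixels);
    }

    // Opens url in a new tab and returns the handle of the original window,
    // so the caller can switch back with driver.switchTo().window(handle).
    static String openInNewTab(WebDriver driver, String url) {
        String original = driver.getWindowHandle();
        driver.switchTo().newWindow(WindowType.TAB);
        driver.get(url);
        return original;
    }
}
```

A relative locator is an ordinary `By`, so it can be passed to `driver.findElement(...)` or `driver.findElements(...)` like any other locator.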
Selenium 3 vs. Selenium 4
The key differences between Selenium 3 and Selenium 4 show exactly what has changed in the upgrade. Some of them are as follows:
Feature | Selenium 4 | Selenium 3 |
---|---|---|
Protocol Used | W3C standard protocol | JSON wire protocol |
Chrome Driver Class Inheritance | Chrome Driver class extends chromium driver class | Chrome Driver class extended Remote WebDriver class |
Selenium Grid | Optimized with enhanced GUI and Docker support | No Support for Docker |
Selenium IDE | Enhanced with improved GUI and cloud-based Selenium Grid | Selenium IDE available as a Firefox add-on |
Automation Testing Setup | Testers do not need to start Hub and Node jars every time | Testers always had to start Hub and Node jars, a difficult task in Selenium 3 |
Ease of Use | Improved user experience with simplified setup | Complex setup with manual initiation of jars |
Driver Compatibility | Better compatibility with various browsers | Limited compatibility, especially with newer browsers |
Remote WebDriver Functionality | Utilizes updated Remote WebDriver functionality | Relies on older Remote WebDriver functionality |
Execution Efficiency | Enhanced efficiency with optimized Selenium Grid | Limited efficiency due to lack of Docker support |
Selenium IDE Accessibility | Accessible across different browsers and platforms | Limited accessibility as a Firefox add-on |
Maintenance Overhead | Reduced maintenance overhead with Docker support | Higher maintenance overhead without Docker support |
Documentation and Community Support | Improved documentation and active community support | Documentation may be outdated, limited community support |
Overall Advancements | Embraces modern technology standards and advancements | Relatively stagnant in terms of technological enhancements |
How Do We Migrate From Selenium 3 to Selenium 4?
Now that you are familiar with Selenium 4 and its changes, the next step is to migrate from Selenium 3 to Selenium 4 so that you can use its capabilities. You can follow these general steps:
- Get familiar with the key features, improvements, and changes in Selenium 4. The official Selenium documentation is the best reference for this.
- Update your project's dependencies to use the Selenium 4 libraries. This means updating Selenium WebDriver, Selenium Server, and other related dependencies in your project's build configuration (e.g., Maven, Gradle).
- Adapt your Selenium 3 codebase to the changes in Selenium 4, such as the deprecations described earlier.
- Selenium 4 uses the W3C WebDriver protocol, so make sure your code is compatible with it, because it differs in places from the JSON wire protocol used in Selenium 3.
- Update the browser drivers (e.g., ChromeDriver, GeckoDriver, etc.) to versions compatible with Selenium 4. Check the official browser driver documentation for the recommended versions.
- Thoroughly test your application using Selenium 4. Run your existing test suites to identify any compatibility issues and update test scripts as needed.
- If you use Selenium Grid, make sure that your grid setup is compatible with Selenium 4. Update the Selenium Grid dependencies and configurations accordingly.
- If you already run Selenium Grid with a Hub and Nodes, or use Selenium IDE, update them to the latest versions that support Selenium 4.
- Update your internal documentation and train your team members on the changes introduced in Selenium 4. This keeps everyone on the same page during the migration.
- Consider a gradual rollout strategy. You can begin with a smaller subset of tests or a non-production environment to identify any issues before migrating your entire test suite.
- Monitor test execution and application behavior after the migration. Address any issues that arise and iterate on the migration process as needed.
Always check the official Selenium documentation and release notes for the most accurate and up-to-date information during the migration process.
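One concrete place where W3C compatibility tends to matter during migration is capability naming: standard capabilities use their W3C names, while anything non-standard is expected to sit under a vendor prefix. The sketch below illustrates the idea; the class name `W3cCapabilities`, the prefix `cloud:options`, and its keys are hypothetical placeholders rather than the settings of any real provider.

```java
import java.util.HashMap;
import java.util.Map;

import org.openqa.selenium.chrome.ChromeOptions;

// Hypothetical helper class, for illustration only.
final class W3cCapabilities {

    static ChromeOptions remoteOptions() {
        ChromeOptions options = new ChromeOptions();

        // Standard W3C capabilities are set through typed setters, no prefix needed.
        options.setBrowserVersion("latest");
        options.setPlatformName("Windows 10");

        // Non-standard settings go under a vendor prefix.
        // "cloud:options" and the keys below are hypothetical placeholders.
        Map<String, Object> vendor = new HashMap<>();
        vendor.put("build", "migration-check");
        vendor.put("name", "smoke-test");
        options.setCapability("cloud:options", vendor);

        return options;
    }
}
```

The resulting options object can be passed to a local driver or to RemoteWebDriver when connecting to a Grid.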
Conclusion
This blog has shown that Selenium 4 is one of the best frameworks for web automation, combining broad browser support, W3C protocol integration, and a Selenium Grid built for better scalability. The Selenium 4 release gives developers and software testers a better testing experience and helps them ensure the quality of their software applications.
Because Selenium 4 comes with an architectural shift in Selenium WebDriver, it provides a more stable test automation platform. This eases the testing process and makes debugging and product enhancement simpler.