Introduction
Selenium Official Website: https://www.selenium.dev/
Selenium is a powerful automation testing toolset that encompasses a range of tools and libraries for automating web browsers. It includes the following three projects:
- Selenium WebDriver
- Selenium IDE
- Selenium Grid
The core of Selenium is WebDriver. Code written against WebDriver can run interchangeably across many browsers, and it drives each browser natively, much as a user would.
The architecture of WebDriver is designed as follows:
Each browser has its own driver, such as ChromeDriver, which controls the browser and exposes the operations it supports. Selenium defines the WebDriver abstraction so that all of these drivers can be used through one unified interface.
Alternatively, WebDriver can drive a browser on another machine through a remote interface.
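A minimal sketch of the remote case, assuming a Selenium Server or Grid is already listening at http://localhost:4444 (this address and the class name RemoteDemo are illustrative assumptions, not part of the original setup). The same IWebDriver abstraction is used; only the implementation class changes.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;

class RemoteDemo
{
    // Assumed address of a running Selenium Server or Grid; adjust to your setup.
    public static readonly Uri GridUri = new Uri("http://localhost:4444");

    static void Main()
    {
        var options = new ChromeOptions();
        // RemoteWebDriver sends commands to the server instead of a local driver.
        IWebDriver driver = new RemoteWebDriver(GridUri, options);
        try
        {
            driver.Navigate().GoToUrl("https://www.selenium.dev/");
            Console.WriteLine(driver.Title);
        }
        finally
        {
            // Always end the remote session so the server can free the browser.
            driver.Quit();
        }
    }
}
```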
The rest of this article shows how to use Selenium WebDriver to write automated test programs in C#.
Installing Dependencies
Create a C# console project and first install the dependency package Selenium.WebDriver, which provides the basic API for browser driver interfaces and the unified abstraction.
Selenium.WebDriver
Next, install the corresponding driver implementation for the browser:
Selenium.WebDriver.ChromeDriver
Simply search for Selenium.WebDriver and then append the suffix for your browser to find the corresponding browser driver package.
First Demo
Open: https://www.selenium.dev/selenium/web/web-form.html
This address is the official test page, which contains many HTML components sufficient for our learning purposes.
The following example includes code to open the page, find elements, fill in content, and retrieve information. Readers can run this code to understand the basic execution process of writing automated test programs; further details will be explained in later sections.
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
class Program
{
    static void Main()
    {
        // Drive Chrome through ChromeDriver
        IWebDriver driver = new ChromeDriver();
        // Open the test page
        driver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/web-form.html");
        // Read page information
        var title = driver.Title;
        // Implicit wait: elements may appear after a delay, so lookups retry for up to 500 ms
        driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(500);
        // Find elements
        var textBox = driver.FindElement(By.Name("my-text"));
        var submitButton = driver.FindElement(By.TagName("button"));
        // Fill text into the input box
        textBox.SendKeys("Selenium");
        // Click the submit button
        submitButton.Click();
        // Submitting navigates to a new page; read the result from that page
        var message = driver.FindElement(By.Id("message"));
        var value = message.Text;
        // Close the browser and end the session
        driver.Quit();
    }
}
Note: The demo program launches Chrome when it starts. If the browser starts too slowly, the demo program may report an error and exit.
In that case, opening Chrome before launching the demo program can shorten the startup time of the new Chrome window.
Once the demo program starts, it will automatically fill out the form and submit it, then redirect to a new page.
Page Load Strategies
Pages are built in different ways. With integrated server-side development such as PHP or ASP, the server renders and returns the whole page; with a separated front end and back end, static resources load first and the page is then generated from data fetched from a back-end API.
Often, a page may take time to finish rendering, with some page elements appearing after a delay. When using WebDriver, we can also decide when to start automation based on our needs.
There are three basic load strategies for pages:
| Strategy | Ready State | Remarks |
| -------- | ----------- | ------------------------------------------------------------ |
| normal | complete | Default value, waits for all resources to download. |
| eager | interactive | DOM access is ready, but other resources like images may still be loading. |
| none | Any | WebDriver will not block at all; it only waits for the initial page to be downloaded. |
If the page takes too long to load due to unimportant resources (e.g., images, CSS, JS), the default strategy normal can be changed to eager or none to speed up session loading.
The strategy is set on the browser options before the driver is created:
var chromeOptions = new ChromeOptions();
chromeOptions.PageLoadStrategy = PageLoadStrategy.Normal;
IWebDriver driver = new ChromeDriver(chromeOptions);
In addition, WebDriver provides three methods to wait for page elements to appear:
- Explicit Wait
- Implicit Wait
- Fluent Wait
A wait can make a FindElement call block until an element that a script adds dynamically is present in the DOM:
// Requires the Selenium.Support package and: using OpenQA.Selenium.Support.UI;
WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
IWebElement firstResult = wait.Until(e => e.FindElement(By.XPath("//a/h3")));
This approach is known as an explicit wait. WebDriver waits for an element matching the XPath //a/h3 to appear, for at most 10 seconds.
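An explicit wait can also test a condition rather than mere presence. The sketch below is a hypothetical helper (WaitHelpers and WaitUntilVisible are names invented for illustration) that waits until an element both exists and is displayed. It assumes the Selenium.Support package is installed, since WebDriverWait lives in OpenQA.Selenium.Support.UI.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI; // provided by the Selenium.Support package

static class WaitHelpers
{
    // Hypothetical helper: waits until the element exists AND is displayed.
    // Throws WebDriverTimeoutException if that does not happen within the timeout.
    public static IWebElement WaitUntilVisible(IWebDriver driver, By locator, TimeSpan timeout)
    {
        var wait = new WebDriverWait(driver, timeout);
        return wait.Until(d =>
        {
            // A failed lookup throws NoSuchElementException, which
            // WebDriverWait ignores by default and simply retries.
            var element = d.FindElement(locator);
            // Returning null tells Until to keep polling.
            return element.Displayed ? element : null;
        });
    }
}
```

It could be called as WaitHelpers.WaitUntilVisible(driver, By.XPath("//a/h3"), TimeSpan.FromSeconds(10)), which mirrors the explicit wait shown above.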
With an implicit wait, WebDriver polls the DOM for a specified time when searching for one or more elements that are not immediately available. This is useful when some elements on a page need time to load. Once set, the implicit wait applies for the duration of the session.
Set the implicit wait timeout:
driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(500);
Warning: Do not mix implicit and explicit waits. Doing so can lead to unpredictable wait times. For example, setting an implicit wait to 10 seconds, and an explicit wait to 15 seconds, may lead to timeouts after 20 seconds.
Fluent Wait defines the maximum amount of time to wait for a condition and the frequency to check for the condition.
Users can configure waits to ignore specific types of exceptions that may arise while waiting, such as NoSuchElementException when searching for elements on the page:
WebDriverWait wait = new WebDriverWait(driver, timeout: TimeSpan.FromSeconds(30))
{
    PollingInterval = TimeSpan.FromSeconds(5),
};
// Keep polling instead of failing when the element is not found yet
wait.IgnoreExceptionTypes(typeof(NoSuchElementException));
Proxies
Proxy servers act as intermediaries for requests between clients and servers. Using a proxy server with Selenium automation scripts can help with:
- Capturing network traffic
- Simulating backend responses from websites
- Accessing target sites under complex network topologies or strict corporate restrictions/policies
In a corporate environment, or on a restricted network while traveling, the browser may be unable to reach a URL without a proxy.
Selenium WebDriver provides methods to set up proxies, as shown in the code sample below:
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
class Program
{
    static void Main()
    {
        ChromeOptions options = new ChromeOptions();
        Proxy proxy = new Proxy();
        // Configure the proxy manually instead of auto-detecting it
        proxy.Kind = ProxyKind.Manual;
        proxy.IsAutoDetect = false;
        // Replace the placeholder with the proxy address used for HTTPS traffic
        proxy.SslProxy = "<HOST:PORT>";
        options.Proxy = proxy;
        options.AddArgument("ignore-certificate-errors");
        IWebDriver driver = new ChromeDriver(options);
        driver.Navigate().GoToUrl("https://www.selenium.dev/");
        // Close the browser and end the session
        driver.Quit();
    }
}
Browser Versions
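Browser options can also request a specific browser version and platform, which is mainly useful when a remote server has to pick a matching browser. For example, to request Chrome version 67 on Windows XP: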
var chromeOptions = new ChromeOptions();
chromeOptions.BrowserVersion = "67";
chromeOptions.PlatformName = "Windows XP";
Element Operations
Element operations can be mainly divided into the following categories:
- File Upload
- Finding Web Elements: locating elements based on provided locator values
- Web Element Interaction: an advanced command set used to manipulate forms
- Locating Strategies: methods for identifying one or more specific elements in the DOM
- Element Information: properties of HTML elements
Next, different methods for interacting with HTML elements will be introduced.
File Upload
Uploading a file essentially means filling in the local path of the file in an input tag of type file. This path needs to be the absolute path of the file.
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
namespace SeleniumDocumentation.SeleniumPRs
{
    class FileUploadExample
    {
        static void Main(string[] args)
        {
            IWebDriver driver = new ChromeDriver();
            try
            {
                // Navigate to URL
                driver.Navigate().GoToUrl("https://www.selenium.dev/selenium/web/web-form.html");
                // The file must exist; use an absolute path
                driver.FindElement(By.Name("my-file")).SendKeys("D:/Desktop/images/学习.jpg");
                var submitButton = driver.FindElement(By.TagName("button"));
                submitButton.Click();
                // This test page shows "Received!" once the form has been submitted
                if (driver.PageSource.Contains("Received!"))
                {
                    Console.WriteLine("file uploaded");
                }
                else
                {
                    Console.WriteLine("file not uploaded");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
            }
            finally
            {
                // Close the browser even if a step above failed
                driver.Quit();
            }
        }
    }
}
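Because the file input expects an absolute path, it can help to resolve and check the path before sending it. The following is a small sketch under that assumption; UploadHelpers, ResolveUploadPath, and Upload are hypothetical names introduced here for illustration.

```csharp
using System;
using System.IO;
using OpenQA.Selenium;

static class UploadHelpers
{
    // Hypothetical helper: turns a possibly relative path into an absolute one
    // and fails early if the file does not exist.
    public static string ResolveUploadPath(string path)
    {
        string fullPath = Path.GetFullPath(path);
        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException("Upload file not found", fullPath);
        }
        return fullPath;
    }

    // Sends the resolved absolute path to an <input type="file"> element.
    public static void Upload(IWebDriver driver, By fileInput, string path)
    {
        driver.FindElement(fileInput).SendKeys(ResolveUploadPath(path));
    }
}
```

With this in place, the upload step could be written as UploadHelpers.Upload(driver, By.Name("my-file"), "images/学习.jpg"), where the relative path is resolved against the current working directory.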
Finding Elements
WebDriver has eight different built-in element location strategies:
| Locator | Description |
| ----------- | ------------------------------------------------------------ |
| class name | Locate elements whose class attribute matches the search value (Compound class names are not allowed) |
| css selector | Locate elements that match the CSS selector |
| id | Locate elements whose id attribute matches the search value |
| name | Locate elements whose name attribute matches the search value |
| link text | Locate anchor elements whose visible text matches the search value exactly |
| partial link text | Locate anchor elements whose visible text partially matches the search value. If multiple elements match, only the first element is selected. |
| tag name | Locate elements whose tag name matches the search value |
| xpath | Locate elements that match the XPath expression |
Here is a use case for finding elements:
// By class name
IWebElement vegetable = driver.FindElement(By.ClassName("tomatoes"));
// By id, then search within that element
IWebElement fruits = driver.FindElement(By.Id("fruits"));
IWebElement fruit = fruits.FindElement(By.ClassName("tomatoes"));
// By css selector (finds the same element as the nested search above)
var fruitByCss = driver.FindElement(By.CssSelector("#fruits .tomatoes"));
// Return multiple elements
IReadOnlyList<IWebElement> plants = driver.FindElements(By.TagName("li"));
To get the currently active element on the page:
var element = driver.SwitchTo().ActiveElement();
string attr = element.GetAttribute("title");
Interacting with Page Elements
There are only five basic commands for operating on elements:
- Click (applicable to any element)
- Send Keys (applicable only to text fields and editable content), i.e. .SendKeys()
- Clear (applicable only to text fields and editable content)
- Submit (applicable only to form elements; not recommended in Selenium 4)
- Select (for select list elements)
Click
The click event on an element can be triggered:
var submitButton = driver.FindElement(By.TagName("button"));
submitButton.Click();
Input
The Send Keys command, i.e. .SendKeys(), types the given keys into an element. It applies to editable elements such as input and select.
driver.FindElement(By.Name("my-file")).SendKeys("D:/Desktop/images/学习.jpg");
Clear
Clear resets the content of an element. It applies to elements that are editable and resettable, such as text inputs and text areas.
IWebElement searchInput = driver.FindElement(By.Name("q"));
searchInput.SendKeys("selenium");
// Clears the entered text
searchInput.Clear();
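For the Select command mentioned in the list of basic commands, a sketch using the SelectElement wrapper might look like the following. It assumes the Selenium.Support package is installed and that the test page contains a select element named my-select with an option labelled "Two"; SelectHelpers and ChooseByText are hypothetical names.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI; // SelectElement is provided by the Selenium.Support package

static class SelectHelpers
{
    // Hypothetical helper: picks an option by its visible text
    // and returns the text of the option that ends up selected.
    public static string ChooseByText(IWebDriver driver, By locator, string optionText)
    {
        // SelectElement throws if the located element is not a <select>.
        var select = new SelectElement(driver.FindElement(locator));
        select.SelectByText(optionText);
        return select.SelectedOption.Text;
    }
}
```

On the test page this could be called as SelectHelpers.ChooseByText(driver, By.Name("my-select"), "Two").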
Get Element Attributes
- Is displayed
- Is enabled
- Is selected
- Get element tag name
- Position and size
- Get element CSS value
- Text content
- Get attribute or property
In JS, we can retrieve an element's value or other attributes like this:
document.getElementById("my-text-id").value
"111111111"
In WebDriver, a few element attributes are exposed directly as properties of the IWebElement interface, but only a few:
Boolean is_email_visible = driver.FindElement(By.Name("email_input")).Displayed;
Other required attributes can be fetched using methods like GetAttribute, for example:
string attr = element.GetAttribute("title");
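The closest counterpart to reading element.value in JS is probably GetDomProperty, which returns the current DOM property rather than the attribute written in the HTML. A minimal sketch, with ElementValues and CurrentValue as hypothetical names:

```csharp
using OpenQA.Selenium;

static class ElementValues
{
    // Hypothetical helper: reads the live "value" of an input,
    // similar to document.getElementById(...).value in JS.
    public static string CurrentValue(IWebDriver driver, By locator)
    {
        IWebElement element = driver.FindElement(locator);
        // GetDomProperty reads the current DOM property;
        // GetDomAttribute would return the attribute as declared in the markup.
        return element.GetDomProperty("value");
    }
}
```

For the input used in the JS snippet above, the call would be ElementValues.CurrentValue(driver, By.Id("my-text-id")).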
The definition of IWebElement is as follows:
public interface IWebElement : ISearchContext
{
string TagName { get; }
string Text { get; }
bool Enabled { get; }
bool Selected { get; }
Point Location { get; }
Size Size { get; }
bool Displayed { get; }
void Clear();
void SendKeys(string text);
void Submit();
void Click();
string GetAttribute(string attributeName);
string GetDomAttribute(string attributeName);
string GetDomProperty(string propertyName);
string GetCssValue(string propertyName);
ISearchContext GetShadowRoot();
}
Browser Page
The operations on the browser page generally fall into four categories:
- Open a website
- Go back
- Go forward
- Refresh
The sample code is also quite simple:
// Open
driver.Navigate().GoToUrl(@"https://selenium.dev");
// Go back
driver.Navigate().Back();
// Go forward
driver.Navigate().Forward();
// Refresh
driver.Navigate().Refresh();
User Login Credentials
So far, I have only found ways to use Basic and Cookie authentication. I have not found a way to set request headers, which the JWT Token method requires.
Here is an example of opening a webpage using a Cookie:
var chromeOptions = new ChromeOptions();
IWebDriver driver = new ChromeDriver(chromeOptions);
try
{
    // A cookie can only be added for the domain of the page that is currently open
    driver.Navigate().GoToUrl("https://www.google.com");
    // Adds the cookie into current browser context
    driver.Manage().Cookies.AddCookie(new Cookie("key", "value"));
    driver.FindElement(By.CssSelector("[name='q']")).SendKeys("webElement");
    // Find and click the search button
    var btnK = driver.FindElement(By.Name("btnK"));
    btnK.Click();
}
finally
{
    driver.Quit();
}
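For Basic authentication, Selenium 4 offers a network authentication handler on Chromium-based drivers. The sketch below shows one way it might be used; the host example.com, the path, and the credentials are placeholders chosen for illustration, and BasicAuthDemo is a hypothetical class name.

```csharp
using System;
using System.Threading.Tasks;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class BasicAuthDemo
{
    // Placeholder matcher: decides which URLs receive the credentials.
    public static bool IsProtectedHost(Uri uri) => uri.Host.EndsWith("example.com");

    static async Task Main()
    {
        IWebDriver driver = new ChromeDriver();
        try
        {
            var handler = new NetworkAuthenticationHandler
            {
                UriMatcher = IsProtectedHost,
                // Placeholder credentials; replace with real ones.
                Credentials = new PasswordCredentials("user", "password")
            };
            INetwork network = driver.Manage().Network;
            network.AddAuthenticationHandler(handler);
            // Monitoring must be running for the handler to answer the challenge.
            await network.StartMonitoring();
            driver.Navigate().GoToUrl("https://example.com/protected");
            await network.StopMonitoring();
        }
        finally
        {
            driver.Quit();
        }
    }
}
```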
This concludes the tutorial on using C# to develop Selenium WebDriver. Readers can refer to the official documentation for more information.