🔎 "Scrape Web Text and Pull Out Only the Digits!"
🧩 What’s the Purpose of This Program?
This Java program leverages Selenium WebDriver to:
Open a web page in Chrome
Find a specific web element by XPath
Grab the text inside that element
Filter and extract only the numeric characters from that text
Ideal for scenarios where you want to scrape numbers such as years, prices, or IDs from UI elements on web pages!
🧠 Step-by-Step Breakdown — What’s Happening Behind the Code?
1️⃣ Initialize ChromeDriver
Launches a new Chrome browser instance via new ChromeDriver().
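Make sure the chromedriver executable is on your system PATH, or provide its location explicitly. With Selenium 4.6 or later, the bundled Selenium Manager can usually locate or download a matching driver for you.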
2️⃣ Open the Target Webpage
The program directs the browser to:
https://opensource-demo.orangehrmlive...
You can swap this URL with any page you want to scrape.
3️⃣ Wait for Page to Load
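Uses Thread.sleep(5000) to pause for a fixed 5 seconds so the page has time to render. A fixed pause is simple, but it wastes time when the page loads quickly and can still be too short when the page loads slowly.
Pro tip: In real projects, use an explicit wait (WebDriverWait), which keeps checking a condition until it holds or a timeout expires.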
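To make the difference between a fixed sleep and an explicit wait concrete, here is a minimal plain-Java sketch of the polling idea. The class PollingWait and the method waitUntil are hypothetical names made up for this illustration and are not part of Selenium; in a real test, Selenium's WebDriverWait together with ExpectedConditions does this polling against the browser for you.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not a Selenium class.
class PollingWait {

    // Re-checks the condition every pollInterval and returns true as soon as it holds.
    // Returns false if the timeout elapses first or the thread is interrupted.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the element is present": becomes true after roughly 300 ms.
        boolean ready = waitUntil(
                () -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(5),
                Duration.ofMillis(50));
        System.out.println("ready=" + ready + " after ~"
                + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Unlike Thread.sleep(5000), this returns as soon as the condition is met, and the 5-second value only acts as an upper bound.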
4️⃣ Locate the Target Element
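Finds a p element whose text is exactly "© 2005 - 2025 " (note the trailing space, which is part of the match) using XPath:
//p[text() = '© 2005 - 2025 ']
Because the comparison is exact, the locator stops matching if the spacing or the years change. Adjust it to suit the element you want.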
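The whitespace sensitivity of an exact text() match can be tried out without a browser. The sketch below is an assumption-laden illustration: it uses the JDK's built-in XPath 1.0 engine on a tiny hand-written XML snippet (not the real page), and the class name FooterXPathDemo and method countMatches are hypothetical. It compares the exact match against a normalize-space() variant that tolerates surrounding whitespace.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK XPath engine, not Selenium or a browser.
class FooterXPathDemo {

    // Counts how many nodes the given XPath expression selects in the XML snippet.
    static int countMatches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Double count = (Double) xpath.evaluate(
                    "count(" + expression + ")",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NUMBER);
            return count.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // \u00A9 is the copyright sign; the text node ends with a trailing space.
        String xml = "<div><p>\u00A9 2005 - 2025 </p></div>";

        // Exact match including the trailing space.
        System.out.println(countMatches(xml, "//p[text() = '\u00A9 2005 - 2025 ']"));
        // Same text without the trailing space: the exact match fails.
        System.out.println(countMatches(xml, "//p[text() = '\u00A9 2005 - 2025']"));
        // normalize-space() trims and collapses whitespace before comparing.
        System.out.println(countMatches(xml, "//p[normalize-space() = '\u00A9 2005 - 2025']"));
    }
}
```

The same expressions can be passed to By.xpath(...) in Selenium, although a real page may split the text across child nodes, so check the locator against the actual DOM.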
5️⃣ Get Text Content
Calls getText() to retrieve the visible text of the element (including the text of any child elements), such as "© 2005 - 2025 ".
6️⃣ Extract Only Numbers
Uses regex replacement:
replaceAll("[^0-9]", "")
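This removes every character that is not a digit, leaving "20052025". Note that the two years run together into one string, because the spaces and the hyphen between them are removed too.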
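The extraction step can be tried on its own, without a browser. The sketch below applies the same replaceAll call to a sample string and, as an optional variation that is not part of the original program, uses Pattern and Matcher to keep each number separate. The class name DigitExtractor and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class DigitExtractor {

    private static final Pattern DIGIT_RUN = Pattern.compile("[0-9]+");

    // Removes every character that is not 0-9, so separate numbers are merged.
    static String digitsOnly(String text) {
        return text.replaceAll("[^0-9]", "");
    }

    // Returns each run of consecutive digits as its own string.
    static List<String> digitGroups(String text) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = DIGIT_RUN.matcher(text);
        while (matcher.find()) {
            groups.add(matcher.group());
        }
        return groups;
    }

    public static void main(String[] args) {
        String footer = "\u00A9 2005 - 2025 "; // \u00A9 is the copyright sign
        System.out.println(digitsOnly(footer));  // prints 20052025
        System.out.println(digitGroups(footer)); // prints [2005, 2025]
    }
}
```

Use digitsOnly when a single ID or amount is expected, and digitGroups when the text may contain several numbers that should stay apart.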
7️⃣ Print the Numbers
Outputs the cleaned string containing just the digits.
8️⃣ Close the Browser
Calls driver.quit() to gracefully close the browser session and free system resources.
🎓 Top 5 Interview-Ready Java & Selenium Q&As
1. Q: How do you extract specific parts of a web element’s text in Selenium?
👉 Use getText() to get the string, then apply Java regex methods like replaceAll().
2. Q: Why is waiting important before interacting with elements?
👉 To ensure the element is present and ready before you use it, which helps avoid NoSuchElementException and StaleElementReferenceException.
3. Q: What does the regex [^0-9] mean?
👉 It matches any single character that is not a digit from 0 to 9, so replacing every match with an empty string leaves only the digits.
4. Q: How do you ensure the ChromeDriver works with your Selenium tests?
👉 Keep the chromedriver executable in your system PATH or specify its path in your code setup.
5. Q: Why call driver.quit() at the end of tests?
👉 It closes all browser windows and ends the WebDriver session, preventing resource leaks.
🧾 Conclusion: Extracting Numbers From Web Text Simplified
By combining Selenium's browser automation with Java’s string manipulation, you can quickly isolate the numbers embedded in a web element's text. This technique is useful for web scraping, automated testing, and data validation workflows.
🏷️ Hashtags to Boost Your Java & Selenium Skills
#JavaAutomation, #SeleniumWebDriver, #WebScraping, #RegexInJava, #ExtractNumbers, #JavaProgramming, #ChromeDriver, #CodingInterview, #TestAutomation, #SoftwareTesting, #WebDriverWait, #ProgrammingTips, #QAEngineer, #AutomationTesting, #JavaRegex, #CleanCode, #EfficientCoding, #TechInterviewPrep, #JavaSelenium, #DataExtraction