
How We Collected Nationwide Tire Pricing Data for a Leading U.S. Retailer



Through this project, we helped a leading U.S. tire retailer monitor nationwide pricing and shipping data from 20 major competitors, covering over 50,000 SKUs and generating roughly one million pricing rows per weekly crawl.


The challenges included add-to-cart pricing, login-required sites, captchas, and multi-seller listings, all of which required adaptive algorithms, caching, and contextual parsing to ensure 99% accuracy.


Our QA framework, built around cached-page validation and regression testing, became a standard for future projects, while the NLP-based product-matching and multi-seller ranking systems we developed now power other Ficstar pricing intelligence solutions across multiple industries.


The project strengthened relationships with manufacturers interested in MAP compliance and demonstrated how a reliable, large-scale data pipeline can give retailers a lasting competitive advantage.



A Nationwide Pricing Intelligence System



The core objective was clear: gather tire pricing data and shipping costs across the United States, covering 20 national competitors. The client wanted to ensure that their retail prices were equal to or lower than anyone else's in the market.


In addition to that, we handled several smaller but equally important tasks:

  • Monitoring MAP (Minimum Advertised Price) compliance

  • Comparing installation fees between retailers

  • Capturing entry-level pricing for every tire size


These weren’t one-off crawls; they required automated systems running on schedules, data normalization processes, and ongoing adjustments as websites changed. The goal was to provide a complete and accurate pricing picture, daily, weekly, and during key promotional periods.



Scale and Complexity



The scale was massive. We were dealing with roughly 50,000 unique SKUs, and for each of those, we had to collect data from multiple competitors across different ZIP codes.


Some retailers changed prices depending on region or shipping distance, so we built our system to query up to 50 ZIP codes per site. That resulted in roughly 1 million pricing rows per crawl, and that’s before accounting for multi-seller listings or bundle variations.
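
To make the volume concrete, here is a minimal sketch of how a crawl plan like this can be enumerated. The SKU, competitor, and ZIP lists are placeholders, and in practice not every competitor carries every SKU or varies prices by region, which is why the real crawls produced roughly one million rows rather than the full combinatorial total.

```python
from itertools import product

# Placeholder inputs; the real lists came from the client's catalog and
# per-site configuration (up to 50 ZIP codes for region-sensitive sites).
skus = [f"SKU-{i:05d}" for i in range(50_000)]
competitors = [f"competitor-{i}.example" for i in range(20)]
zip_codes = ["10001", "30301", "60601", "75201", "90001"]

def build_crawl_tasks(skus, competitors, zip_codes):
    """Yield one crawl task per (SKU, competitor, ZIP) combination."""
    for sku, site, zip_code in product(skus, competitors, zip_codes):
        yield {"sku": sku, "site": site, "zip": zip_code}

# Each completed task becomes at least one pricing row (more when a listing
# has multiple sellers), which is how weekly crawls reach seven figures.
```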


We ran full-scale crawls every week, but we also scheduled ad-hoc crawls during holidays to capture time-sensitive sale prices, especially during major events like Black Friday, Labor Day, and Memorial Day. These snapshots gave our client the ability to see not only baseline pricing but also promotional trends across the industry.



One of the biggest challenges early on was that many competitors didn’t display prices until after the product was added to the cart. That meant our crawlers had to mimic user behavior, navigating the site, selecting tire sizes, adding items to the cart, and then scraping the “real” price from inside the checkout flow.
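
As a rough illustration of that flow, the sketch below uses Playwright to walk a product page the way a shopper would and read the price from the cart. The selectors and ZIP field are hypothetical, since every target site laid this out differently.

```python
from playwright.sync_api import sync_playwright

def cart_price(product_url: str, zip_code: str) -> str | None:
    """Mimic a shopper: open the page, set a ZIP, add to cart, read the cart price."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(product_url, wait_until="domcontentloaded")
        page.fill("#zip-input", zip_code)              # hypothetical selector
        page.click("button.add-to-cart")               # hypothetical selector
        page.wait_for_selector(".cart-line-price")     # hypothetical selector
        price = page.text_content(".cart-line-price")
        browser.close()
        return price
```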


Some sites even required account logins, so we had to handle session management carefully to maintain efficiency without violating site restrictions or triggering anti-bot mechanisms.
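
For the login-gated sites, the key was reusing one authenticated session across many product pages instead of logging in repeatedly. Here is a minimal sketch with `requests`, assuming a simple form-based login and hypothetical field names and file paths:

```python
import pickle
from pathlib import Path

import requests

COOKIE_FILE = Path("session_cookies.pkl")  # hypothetical local cache

def get_session(login_url: str, credentials: dict) -> requests.Session:
    """Reuse one authenticated session across requests; fewer logins means
    less load on the site and fewer chances to trip anti-bot rules."""
    session = requests.Session()
    if COOKIE_FILE.exists():
        session.cookies.update(pickle.loads(COOKIE_FILE.read_bytes()))
        return session
    session.post(login_url, data=credentials, timeout=30)  # field names vary per site
    COOKIE_FILE.write_bytes(pickle.dumps(session.cookies))
    return session
```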



Captchas, Sellers, and Hidden Prices



This project was unique in that nearly every target website required a different approach. From the structure of product pages to the anti-bot systems they used, no two domains behaved the same way.


1. Captchas and Blocking

Several competitors used “Press and Hold” captchas, which slow down crawls dramatically because they require interaction per request. We had to fine-tune thread management and proxy rotation to maintain speed while keeping success rates high.


Blocking was an ongoing issue. I often joke that "blocking is just a feedback mechanism": it tells you what needs improvement. We made constant updates to our algorithms, request timing, proxies, and header management to keep crawls running smoothly.
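
A simplified sketch of what "request timing, proxies, and header management" looks like in practice. The proxy endpoints and user-agent strings below are placeholders; the production system rotated much larger pools and tuned delays per site.

```python
import itertools
import random
import time

import requests

# Placeholder pools; production used large rotating proxy and header sets.
PROXIES = itertools.cycle([
    "http://proxy-1.example:8080",
    "http://proxy-2.example:8080",
])
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

def polite_get(url: str) -> requests.Response:
    """Rotate proxy and user agent, then pause with jitter before the next call."""
    proxy = next(PROXIES)
    response = requests.get(
        url,
        headers={"User-Agent": random.choice(USER_AGENTS)},
        proxies={"http": proxy, "https": proxy},
        timeout=30,
    )
    time.sleep(random.uniform(1.0, 3.0))  # jittered delay between requests
    return response
```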


2. Product Format Challenges

Tire listings were another source of complexity. Some prices were for a single tire, some for a pair, and others for a set of four. Unfortunately, that information wasn't always in a structured format; it was often hidden inside the product title.


That meant we had to write parsing rules that analyzed product names to determine what the price actually referred to, and then calculate a normalized price per tire.
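
A stripped-down version of that normalization step might look like the following; the pack-size patterns are illustrative examples of wording we saw in titles, not the full rule set.

```python
import re

# Pack-size hints commonly hidden in titles, e.g. "Set of 4", "(2 Tires)", "Pair".
PACK_PATTERNS = [
    (re.compile(r"set of (\d)", re.I), lambda m: int(m.group(1))),
    (re.compile(r"\((\d)\s*tires?\)", re.I), lambda m: int(m.group(1))),
    (re.compile(r"\bpair\b", re.I), lambda m: 2),
]

def price_per_tire(title: str, listed_price: float) -> float:
    """Infer pack size from the product title and normalize to a per-tire price."""
    for pattern, count in PACK_PATTERNS:
        match = pattern.search(title)
        if match:
            return round(listed_price / count(match), 2)
    return listed_price  # default assumption: the price covers a single tire

# price_per_tire("All-Season 225/65R17 (Set of 4)", 399.96) -> 99.99
```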


3. Multiple Sellers per Product

Another tricky layer came from multi-seller marketplaces. Each tire listing could have multiple sellers, each offering different prices and shipping options.


For that reason, our crawlers had to capture a row for every seller, including their price, rank, and stock availability. We also discovered that the “Rank 1” seller wasn’t always the cheapest, so we developed comparison logic to ensure the lowest price was always returned.
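
The comparison logic itself is simple once every seller is captured as its own row; the sketch below picks the cheapest in-stock offer by total cost (price plus shipping), with the field names as assumptions.

```python
def best_offer(sellers: list[dict]) -> dict:
    """Return the cheapest in-stock offer by price + shipping, ignoring rank."""
    in_stock = [s for s in sellers if s.get("in_stock")]
    return min(in_stock, key=lambda s: s["price"] + s.get("shipping", 0.0))

offers = [
    {"seller": "A", "rank": 1, "price": 120.00, "shipping": 15.00, "in_stock": True},
    {"seller": "B", "rank": 3, "price": 112.50, "shipping": 0.00, "in_stock": True},
]
print(best_offer(offers)["seller"])  # -> "B", even though seller A holds rank 1
```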


4. Duplicate URLs

It wasn’t uncommon for the same tire product to appear under several URLs on a single site. We implemented internal comparison scripts to identify duplicates and determine which version offered the best price.
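
A minimal version of that duplicate check, assuming each row carries brand, model, and size fields (the real comparison scripts also matched on normalized product names):

```python
def dedupe_best_price(rows: list[dict]) -> list[dict]:
    """Collapse duplicate listings of the same tire down to the lowest-priced row."""
    best: dict[tuple, dict] = {}
    for row in rows:
        key = (row["brand"].lower(), row["model"].lower(), row["size"])
        if key not in best or row["price"] < best[key]["price"]:
            best[key] = row
    return list(best.values())
```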


5. Frequent Price Fluctuations

Tire prices change constantly, and shipping costs, regional taxes, and promotions all affect the final price. To ensure we were capturing accurate, time-bound data, every crawl stored cached pages and timestamps. This way, if a question arose later, we could always go back and confirm what the price was at that exact moment.
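
The caching itself can be as simple as writing the raw page next to a UTC timestamp, keyed by crawl and URL. The storage layout below is a hypothetical sketch, not our production format.

```python
import gzip
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

CACHE_DIR = Path("page_cache")  # hypothetical location

def cache_page(url: str, html: str, crawl_id: str) -> Path:
    """Store the raw page with a UTC timestamp so any delivered price can be
    traced back to the exact page it was extracted from."""
    CACHE_DIR.mkdir(exist_ok=True)
    digest = hashlib.sha256(url.encode()).hexdigest()[:16]
    path = CACHE_DIR / f"{crawl_id}_{digest}.json.gz"
    record = {
        "url": url,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "html": html,
    }
    path.write_bytes(gzip.compress(json.dumps(record).encode()))
    return path
```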



QA and Regression Testing



With over a million pricing rows per week, accuracy wasn't optional; it was everything. That's where our quality assurance framework came in.


We approached QA in several layers:


  1. Cached Pages: Every page we crawled was stored with a timestamp, ensuring that if prices were questioned later, we could show proof of what was captured at that time.

  2. Regression Testing for Prices: We compared current prices to previous crawls. If a price suddenly dropped 80% or doubled overnight, it triggered an anomaly flag for human review (a sketch of this check follows the list).

  3. Regression Testing for Product Matching: We constantly checked matching rates to make sure that missing SKUs were actually unavailable on competitor sites, not just skipped due to crawler issues.
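
A minimal sketch of the price regression check from layer 2, assuming prices are keyed by SKU; the thresholds are illustrative stand-ins for the 80%-drop and overnight-doubling rules above.

```python
def flag_price_anomalies(current: dict[str, float], previous: dict[str, float],
                         drop_ratio: float = 0.2, spike_ratio: float = 2.0) -> list[dict]:
    """Compare this crawl's prices to the last crawl and flag suspicious moves
    for human review instead of shipping them straight to the client."""
    flags = []
    for sku, price in current.items():
        old = previous.get(sku)
        if not old:
            continue  # new SKU; handled by the product-matching regression instead
        ratio = price / old
        if ratio <= drop_ratio or ratio >= spike_ratio:
            flags.append({"sku": sku, "previous": old, "current": price,
                          "ratio": round(ratio, 2)})
    return flags
```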


This mix of automation and manual verification helped us consistently achieve 99% accuracy across millions of rows, a benchmark we now use in other enterprise projects.



Turning Data Into Strategy


The data we delivered was more than a spreadsheet; it was a competitive strategy engine. The client could instantly see how their prices compared to 20 competitors in every ZIP code, and whether they were above or below the market average.


We also gave them visibility into:


  • Shipping cost differences

  • MAP violations by sellers

  • Price rank by seller on major marketplaces

  • Regional price variations and how they affected conversions


This level of granularity allowed the client to adjust their prices faster and smarter. They could identify gaps before competitors reacted and maintain pricing leadership nationwide.

What we found most satisfying was seeing how our work directly influenced real-world business decisions: at its core, the goal was helping a national retailer stay competitive every single day.



Unexpected Outcomes and Industry Impact


One of the best parts of this project was the ripple effect it created. Because of how successfully it ran, our work got the attention of tire manufacturers interested in MAP (Minimum Advertised Price) compliance monitoring. They wanted to ensure resellers weren’t advertising below approved thresholds, a task our crawlers were already optimized for.


This project also proved that the frameworks we built for tires (handling multi-seller listings, frequent price changes, and complex product formats) could easily apply to other industries.


Since then, we’ve used the same methodologies in projects for:


  • Consumer electronics (multiple sellers, frequent promotions)

  • Home improvement and hardware (regional pricing)

  • Appliances and automotive parts (bundle-based pricing)

Every one of those projects benefited from the tire industry groundwork.



Lessons Learned and Frameworks


There are several technical and process lessons I’ve carried forward from this project:


  1. Caching as a QA Tool: Caching isn't just a backup; it's a transparency layer that builds client trust.

  2. Context-Aware Parsing: Product names often hide essential data; parsing them intelligently with NLP improves accuracy.

  3. Regression Testing as a Habit: Automated regression testing for both price and product match rates is now standard on all large-scale projects.

  4. Multi-Seller Handling: Having structured ranking and pricing logic for multiple sellers gives a more realistic view of market competition.

  5. Anomaly Detection: Tracking sudden data shifts automatically saves hours of manual QA work and keeps clients confident in the dataset.


These have all become part of Ficstar’s standard enterprise pricing intelligence toolkit.



Infrastructure and Automation


Running weekly nationwide crawls at this scale requires serious infrastructure.

We used a distributed crawling system with thousands of threads running in parallel, load balancing, and rotating proxies to stay efficient.


Each dataset contained:


  • SKU and brand identifiers

  • Competitor and seller info

  • Single tire pricing

  • Shipping costs per ZIP code

  • Stock status

  • Timestamp


All this data was normalized, validated, and stored in our internal warehouse. Once QA was complete, we pushed the cleaned data to dashboards and API endpoints for client consumption.
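
As a rough picture of what one normalized row carried, here is a hypothetical schema; the field names are illustrative rather than the exact delivered columns.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class PricingRow:
    """One observation per (SKU, competitor, seller, ZIP) per crawl."""
    sku: str
    brand: str
    competitor: str
    seller: str
    seller_rank: int
    price_per_tire: float
    shipping_cost: float
    zip_code: str
    in_stock: bool
    captured_at: datetime
```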


Automation was critical. Every process, from scheduling crawls to QA regression, was automated with monitoring alerts. If anything broke or slowed down, I’d know about it in real time.
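
A toy version of that monitoring, assuming each crawl reports a finish time and row count; the real alerts went to people in real time, not just to logs, and the thresholds here are illustrative.

```python
import logging
from datetime import datetime, timedelta, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("crawl-monitor")

def check_crawl_health(last_finished: datetime, row_count: int,
                       expected_rows: int = 1_000_000,
                       max_age_hours: int = 24 * 8) -> None:
    """Warn when a weekly crawl looks stale or delivers noticeably fewer rows
    than expected."""
    age = datetime.now(timezone.utc) - last_finished
    if age > timedelta(hours=max_age_hours):
        log.warning("Crawl is stale: last finished %s ago", age)
    if row_count < 0.9 * expected_rows:
        log.warning("Row count %d is below 90%% of the expected %d",
                    row_count, expected_rows)
```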


Adapting to Market Dynamics


The tire market is highly seasonal, and pricing changes dramatically around holidays. That’s why ad-hoc crawls were essential.


Running additional crawls during holiday sales let us capture short-term price cuts that often influenced long-term strategies. These short-term snapshots helped the client understand how competitors behaved during major sales events and how deeply they discounted certain SKUs.


By comparing these temporary price changes against the baseline data, we were able to provide insights into which competitors were aggressively using promotions and which relied more on steady pricing.
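
One way to express that comparison, using pandas and assuming each snapshot has sku, competitor, and price columns:

```python
import pandas as pd

def discount_depth(baseline: pd.DataFrame, holiday: pd.DataFrame) -> pd.DataFrame:
    """Join a holiday snapshot against baseline prices and summarize how deeply
    each competitor discounted; column names are assumptions."""
    merged = holiday.merge(baseline, on=["sku", "competitor"],
                           suffixes=("_promo", "_base"))
    merged["discount_pct"] = 1 - merged["price_promo"] / merged["price_base"]
    return (merged.groupby("competitor")["discount_pct"]
                  .agg(["mean", "max", "count"])
                  .sort_values("mean", ascending=False))
```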



The Data Lifecycle


Every crawl followed a strict pipeline:


  1. Data Capture: The crawler visited each product page, handling logins, captchas, and cart additions.

  2. Extraction and Normalization: The raw data was parsed into structured fields: SKU, price, seller, region, and so on.

  3. Validation: We ran regression tests and anomaly checks against historical data.

  4. Storage: Cleaned data was stored with time-based indexing for version tracking.

  5. Delivery: The final datasets were delivered through dashboards, APIs, and direct downloads.


That consistency, week after week, was what turned a raw dataset into an actionable pricing intelligence system.



Collaboration and Partnership


Large-scale projects like this depend on collaboration. Throughout the process, we worked closely with the client’s analytics team, discussing anomalies, refining the matching logic, and aligning schedules.


One thing I've learned over time is that enterprise web scraping isn't just about code; it's about communication. Websites change, requirements evolve, and priorities shift. The only way to keep a project like this running smoothly is by maintaining open dialogue and flexibility.


That strong collaboration helped us build a lasting partnership that extended beyond this single project.



Reflections


Looking back, this project pushed every aspect of our technical and analytical capabilities. It challenged our infrastructure, QA processes, and creativity in problem-solving.

It also reaffirmed something I believe deeply: data quality matters more than quantity.


Collecting millions of rows is easy. Ensuring those rows are accurate, contextual, and usable is where the real value lies. Through continuous adaptation, whether it was battling captchas, parsing product names, or building smarter matching systems, we transformed raw web data into something meaningful: a real-time pricing intelligence tool that gave a national retailer a measurable competitive edge.


The lessons from this project continue to shape how we approach data collection. Today, our focus is on making crawlers even smarter, integrating AI-driven anomaly detection, dynamic rate-limiting, and automated schema recognition to handle evolving website structures.

Our goal is to get as close to 100% accuracy and uptime as possible, no matter how complex the site. Every improvement we make across projects comes from what we’ve learned here.




Key Takeaways


The primary goal of this project was to collect and analyze tire pricing and shipping costs nationwide to ensure the client maintained competitive pricing across all major online retailers. Secondary goals included monitoring MAP compliance, tracking tire installation fees, and identifying entry-level pricing by tire size.


  • Nationwide Competitive Monitoring: Ficstar collected tire pricing and shipping data across the U.S. from 20 major competitors, helping the client ensure their prices stayed equal to or lower than competitors in every ZIP code.


  • High-Volume Data Collection: Over 50,000 SKUs were tracked across 1 million pricing rows per crawl, with weekly updates and ad-hoc crawls during holidays to capture time-sensitive promotions.


  • Complex Technical Environment: Many websites only revealed prices after items were added to the cart, required logins, and listed multiple sellers per product, demanding adaptive crawling logic and ongoing algorithm updates.


  • Advanced QA Framework: Cached pages, regression testing for price changes and product availability, and historical comparison ensured 99% data accuracy at scale.


  • Scalable and Reusable Methodology: The data-matching, QA, and multi-seller ranking systems developed for this project are now standard across Ficstar’s enterprise pricing solutions.


  • Cross-Industry Applications: Insights from this tire project have since been applied to other industries, such as consumer electronics, home improvement, and retail, enhancing Ficstar’s ability to handle large-scale, multi-seller ecosystems.


  • Stronger Client Relationships: The collaboration generated industry referrals, including tire manufacturers interested in MAP compliance monitoring, expanding Ficstar’s network in the automotive space.



