The Toughdata Blog

Technical and practical perspectives on web scraping and AI.

How To Use YOLOv9 With OpenCV for Java

It took me about a week to get YOLOv9 running with OpenCV's DNN module in Java, due to sparse documentation. I'm hoping this article will spare you some time and provide all the information you need.


Aug 15, 2024

A New Alternative To FlyCaptcha For TikTok Captcha Recognition

I'd like to talk a little bit about a tool I made that will help developers automate processes on TikTok even better than before.


Jun 11, 2024

Create TikTok Accounts In Bulk With Hyperaccs

Digital marketers and web scrapers will love this tool for registering accounts in bulk.


Apr 14, 2024

How To Write A Spintax Parser From Scratch In Python

Curious about how spintax works? This tutorial will show you how to write a spintax processing algorithm in Python for integration into your outreach programs.


Mar 18, 2024

How To Type-Hint SQLAlchemy Join Queries

Properly type hinting join queries is not well documented or even spoken of. This article seeks to dispel some confusion around the topic.


Feb 21, 2024

How To Fine-Tune FLAN-T5 For Question Answering

Flan-T5 is a powerful text-to-text model that excels at summarization, translation, and question answering. Today we will be fine-tuning it on over 56,000 question-answer pairs scraped from Quora.


Aug 24, 2023

Scraping Instagram Profiles At Scale With Lamadava

When it comes to scraping Instagram... How fast is fast? How cheap is cheap? How much data do you need? Lamadava and Datalama may have the answers you need.


Jul 19, 2023

How To Fine-Tune RWKV On A New Dataset

RWKV is the latest advancement in language modeling. It is extremely fast, powerful, and efficient. This post shows that training it on a new corpus can also be simple with the HuggingFace Transformers library.


Jul 17, 2023

Scraping Social Media With Datacenter Proxies

If you are a small business that needs to scrape TikTok, Instagram, or other social media, you have probably heard the mantra that you need residential proxies. If you are unable to pay the premium for residential proxies, I have good news for you. Depending on your use case, a solid datacenter proxy can get the job done. Do your research and figure out what works best before shelling out your cash.


Jul 11, 2023

Scraping Influencer Follower Data For Lead Generation

Lead generation is a classic use case for web scraping. In this post I will describe how to scrape a TikTok influencer's followers for like-minded individuals to reach out to.


Jul 02, 2023