
Man scrapes 4.1 million jobs with ChatGPT, turns data into new hiring platform
Using the ChatGPT API, he managed to scrape 4.1 million jobs directly from company websites, jobs that were not even on Indeed or LinkedIn.

CHATGPT AS A DATA CLEANER

The problem with scraping job listings is that every company uses a different format on its website. One might list the title and salary upfront, another hides them at the bottom, and others use completely custom layouts. That makes it hard to collect large numbers of jobs in a structured way.

Enter ChatGPT.

After experimenting with the ChatGPT API, the Reddit user realised he could dump messy, raw job postings into the model and ask it to return neatly formatted information in JSON, a standardised, machine-readable data format that makes the extracted fields easy to store and search. The fields included job title, years of experience, salary, location, and whether the role was remote.

In his words: 'After playing with ChatGPT's API, I realised that you can effectively dump raw job descriptions and ask it to give you formatted information back in JSON (ex salary, yoe, etc).'

This was the breakthrough that made large-scale scraping possible.

4.1 MILLION JOBS, 220K REMOTE

Using this method, he scraped 4.1 million jobs directly from company websites. Of these, more than 220,000 were remote positions, something jobseekers are increasingly after.

He then built a platform called Hiring.Cafe, where users can search these listings with filters far more powerful than what LinkedIn or Indeed usually offer. You can filter by job title, exclude irrelevant keywords, and even slice by years of experience.

'Update: I've now used this technique to scrape 4.1 million jobs (with over 220k remote jobs) and built powerful filters. I made it publicly available here in case you're interested (Hiring.Cafe),' he announced on Reddit.

IS THIS REPEATABLE FOR OTHERS?

The big question many asked in the Reddit thread was: can others do the same thing?

The answer is yes, but with caveats.
The method itself is straightforward:

1. Scrape job listings from company websites.
2. Feed the raw text into ChatGPT's API.
3. Get back structured data (JSON) with clean labels like salary, years of experience, etc.
4. Store that data in a searchable database.

But the Reddit user also pointed out that the process is computationally heavy and expensive at scale. While it's easy to try on a small sample of job postings, running millions of listings through ChatGPT's API requires both infrastructure and money.

To make sure the jobs were legitimate, he also cross-referenced company data using Apollo.io and Dun & Bradstreet, filtering out shady agencies and duplicates. He noted in his post that this step made his database more reliable than simply scraping LinkedIn.

In his Reddit post, he even shared a link to the ChatGPT prompt he used to process millions of job postings.

USERS NOTICE THE DIFFERENCE

Early users of Hiring.Cafe said the difference was obvious: fewer ghost jobs, fewer spammy recruiter listings, and more real roles posted directly by companies.

There are still occasional errors. Sometimes a job gets wrongly tagged as remote, or salary details are missed. But compared to the bigger problem of ghost postings, these are minor.

As the Reddit user summed it up: 'The jobs themselves are real and posted directly by the companies. That's what matters most.'

A GLIMPSE INTO AI-POWERED JOB SEARCH

This project shows how tools like ChatGPT can do more than just write essays or code snippets. With the right workflow, they can structure messy human data at scale, turning the chaos of job listings into something useful.

And for jobseekers who've wasted hours scrolling through fake postings, this AI-powered shortcut might feel like a breath of fresh air.
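
For readers who want to try the idea on a small sample, the steps after scraping (ask the model for JSON, validate it, store it, filter it) might look roughly like the Python sketch below. It is not the Reddit user's code. The prompt wording, the field names, the `Job` class, the table layout and the `call_model` parameter are all assumptions made for illustration; the prompt he actually used is linked from his post and is not reproduced here. No real API call is made: `call_model` stands in for whatever function sends a prompt to a model and returns its text reply.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Hypothetical prompt wording, not the one shared in the Reddit post.
PROMPT_TEMPLATE = (
    "Extract these fields from the job posting below and reply with a single "
    "JSON object and nothing else: title (string), yoe (integer or null), "
    "salary (string or null), location (string or null), remote (true or false)."
    "\n\nJob posting:\n{posting}"
)


@dataclass
class Job:
    title: str
    yoe: Optional[int]
    salary: Optional[str]
    location: Optional[str]
    remote: bool


def parse_job(model_reply: str) -> Job:
    """Validate the model's reply; raise ValueError if it is not usable."""
    try:
        data = json.loads(model_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("title"), str):
        raise ValueError("reply is missing a string 'title'")
    yoe = data.get("yoe")
    if yoe is not None and (isinstance(yoe, bool) or not isinstance(yoe, int)):
        raise ValueError("'yoe' must be an integer or null")
    return Job(
        title=data["title"].strip(),
        yoe=yoe,
        salary=data.get("salary"),
        location=data.get("location"),
        remote=bool(data.get("remote", False)),
    )


def extract_job(posting: str, call_model: Callable[[str], str]) -> Job:
    """Steps 2 and 3: send the raw text to the model, parse the JSON reply."""
    return parse_job(call_model(PROMPT_TEMPLATE.format(posting=posting)))


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Step 4: a minimal searchable store (in-memory by default)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs (title TEXT NOT NULL, yoe INTEGER, "
        "salary TEXT, location TEXT, remote INTEGER NOT NULL)"
    )
    return conn


def save_job(conn: sqlite3.Connection, job: Job) -> None:
    conn.execute(
        "INSERT INTO jobs VALUES (?, ?, ?, ?, ?)",
        (job.title, job.yoe, job.salary, job.location, int(job.remote)),
    )


def search(
    conn: sqlite3.Connection,
    title_contains: str,
    exclude: Sequence[str] = (),
    max_yoe: Optional[int] = None,
    remote_only: bool = False,
) -> List[str]:
    """Filter by title keyword, excluded keywords, experience and remote flag."""
    sql = "SELECT title FROM jobs WHERE title LIKE ?"
    params: list = [f"%{title_contains}%"]
    for word in exclude:
        sql += " AND title NOT LIKE ?"
        params.append(f"%{word}%")
    if max_yoe is not None:
        sql += " AND yoe IS NOT NULL AND yoe <= ?"
        params.append(max_yoe)
    if remote_only:
        sql += " AND remote = 1"
    sql += " ORDER BY title"
    return [row[0] for row in conn.execute(sql, params)]
```

Validating the reply before storing it matters because, as the article notes, the model sometimes mislabels a field or misses one; rejecting malformed replies at least keeps unusable rows out of the database. At the scale described in the article, a real pipeline would also need batching, retries and cost controls, none of which is shown here.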

Related Articles


Time of India
2 hours ago
Salary at 10 am, resignation at 10:05 am: Toxic culture or smart timing? Bengaluru CA-turned-founder decodes
Payday usually brings cheer, but for many employees, it's also the finish line. The moment their salaries hit, resignation letters start flying. A viral LinkedIn post recently captured this exact phenomenon: salary at 10:00 am, resignation at 10:05 am. While it sparked endless debates online about loyalty, professionalism, and workplace culture, Bengaluru-based CA-turned-founder Meenal Goel stepped in with her take, and it hit a nerve.

She explained that these lightning-fast exits aren't always about 'grabbing the money and running.' More often, they're the result of employees holding on until they've been paid what they're owed before finally walking away from an environment that has drained them. 'I've done this myself,' she admitted, recalling a previous job where she repeatedly raised issues about workload, lack of support, and unclear expectations, only to be ignored. Months later, she resigned right after payday.

The real problem, she argued, lies in companies treating exit interviews as their first real listening exercise. By the time an employee is out the door, it's already too late. 'If someone leaves because of you, that's a moment for the company to reflect, not just 'collect feedback,'' Goel said.

Not every resignation is the same

She was quick to point out that not every resignation after payday signals toxicity. Sometimes, it's just better timing or the arrival of a new opportunity. Companies, of course, lose trust, money, and time when employees walk without notice, but if the trend keeps repeating, Goel posed a hard question: 'What made them stay only until payday in the first place?'

Internet reacts

Many agreed that quitting right after payday is less about the five minutes between salary credit and resignation, and more about the months or years leading up to it. Users pointed out reasons like broken promises at the time of hiring, managers dodging hard conversations until notice was given, stalled recognition, heavy workloads, and unchecked office politics.

While some acknowledged it can simply be about timing or a better offer, many argued that recurring patterns signal deeper problems like cultural misalignment, burnout, or leadership blind spots. In truly healthy workplaces, people don't wait for payday to escape; they stay because they genuinely want to.


NDTV
2 hours ago
How To Instantly Erase Your ChatGPT History Before It's Too Late
Are you one of the millions using OpenAI's ChatGPT? While this powerful AI can write anything from a quick email to an entire article, a crucial question is being asked by users worldwide: where does all that data go? Your chat history, filled with personal and professional queries, is saved by default. But a simple solution exists to wipe the slate clean and protect your privacy. We've got the foolproof, step-by-step guide to help you take back control of your conversations on any device.

On Your Phone: The Quick Privacy Button You Didn't Know You Needed

For mobile users, deleting your digital footprint is easier than you think. A few taps can ensure that your past chats are no longer visible. Here's how to do it:

1. Open the ChatGPT app on your smartphone.
2. Tap the menu icon (two lines or a single horizontal line) in the top-left corner to access the sidebar.
3. Tap on your profile icon or name at the bottom of the screen.
4. Navigate to "Data Controls".
5. In this menu, you'll find the option to "Clear Chat History".
6. Tap it and confirm your decision. Your chat history on the app will be wiped clean.

On the Web: A Single Click to a Clean Slate

If you use ChatGPT primarily on your computer, the process is just as simple. You can delete your entire history in one go. Here's the simple method:

1. Go to the ChatGPT website and log in.
2. Click your profile icon in the bottom-left corner of the screen.
3. A menu will appear. Click on "Settings".
4. In the settings menu, look for the "Data Controls" tab.
5. Inside the "Data Controls" section, click on the button that says "Delete All Chats".
6. Confirm your action, and all your past conversations will be gone from the web interface.

While this method clears your visible history from both the app and website, it is important to remember that OpenAI's help centre states that chats are scheduled for permanent deletion from their systems within 30 days. This means some data may be retained temporarily for security or legal obligations. For those who want to prevent their data from being used for model training, there is also an option in "Data Controls" to turn off "Chat history & training".


Mint
3 hours ago
AAP MP Raghav Chadha demands free ChatGPT, Gemini for all Indians: 'Next step towards digital democracy'
Aam Aadmi Party (AAP) MP Raghav Chadha on Wednesday called on the Centre to make the advanced versions of AI tools, including ChatGPT, Gemini, Claude and others, free for Indians, saying it could revolutionise farming, education, business and healthcare, among others.

Speaking in the Rajya Sabha, Chadha noted that countries like the UAE, Singapore and China are already offering their citizens free access to advanced AI models, while Indians are at risk of losing the $15 trillion global AI race by 2030.

AI is 'not just a technology but an opportunity to dream big and accomplish those dreams,' the AAP MP from Punjab noted. He claimed it could revolutionise farming, education, business and healthcare, boost productivity, save time, and ensure that 'not a single Indian is left behind in this AI revolution.'

'AI is not just a technology, it is an opportunity, an opportunity to move forward, to dream big, and to fulfil those dreams,' Chadha said.

For farmers, AI can mean smart farming; for students, it can act as a 24x7 tutor; for entrepreneurs, it can be a business planner, and so on, he said. 'Every question has an answer, every problem has a solution within AI,' Chadha told the Rajya Sabha.

Raghav Chadha bats for Digital India using AI

Calling upon the government to take steps, Raghav Chadha said, 'If the Government of India provides advanced generative AI tools (like ChatGPT, Gemini, Grok and Claude) to every citizen, across all age groups and social classes, in local languages, free of cost and in a safe manner, it can multiply our national productivity manifold and save people's precious time.'

Terming it the 'next step towards digital democracy,' Chadha emphasised that in this AI revolution, not a single Indian should be left behind. 'Only then can India truly become a Digital India. My appeal to the Government is to give this matter serious attention,' he added.