Introduction
This project investigates how well the technical skills taught in universities align with those required by the job market, specifically for Software Engineering roles. The LinkedIn Jobs Scraper was developed to automate the collection of job postings and to analyze the technical skills they mention.
Objectives
- Scrape job postings from LinkedIn for Software Engineering roles in the USA.
- Analyze the skills required in job descriptions using Natural Language Processing (NLP).
- Compare these skills with the curriculum data from selected universities.
- Provide insights into how well university curricula align with current market demands.
Technical Approach
We used Python's requests and BeautifulSoup libraries to scrape job listings, and spaCy for natural language processing. The collected data was stored in CSV format and later analyzed with pandas. The project also relied on a dictionary of technical skills and university curricula scraped from publicly available PDFs.
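The storage and analysis steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses Python's standard `csv` and `collections.Counter` modules in place of pandas to stay dependency-free, and the column names (`title`, `skills`), the semicolon-separated skills field, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real scraper's columns may differ.
SAMPLE_ROWS = [
    {"title": "Software Engineer", "skills": "Python;JavaScript"},
    {"title": "Software Engineer I", "skills": "Java;JavaScript;C++"},
    {"title": "Junior Web Developer", "skills": "JavaScript;jQuery"},
]


def write_postings(rows, handle):
    """Write job postings to a CSV file-like object."""
    writer = csv.DictWriter(handle, fieldnames=["title", "skills"])
    writer.writeheader()
    writer.writerows(rows)


def count_skills(handle):
    """Count how many postings mention each skill."""
    counts = Counter()
    for row in csv.DictReader(handle):
        # A set avoids counting a skill twice within one posting.
        skills = {s.strip() for s in row["skills"].split(";") if s.strip()}
        counts.update(skills)
    return counts


buffer = io.StringIO()  # stands in for a CSV file on disk
write_postings(SAMPLE_ROWS, buffer)
buffer.seek(0)
skill_counts = count_skills(buffer)
print(skill_counts.most_common(1))  # [('JavaScript', 3)]
```

Counting each skill once per posting keeps a long description that repeats a keyword from inflating that skill's apparent demand.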
Challenges and Solutions
The primary challenge was accurately extracting technical skills from job descriptions, which often included jargon and company-specific terms. To address this, we combined a predefined list of technical skills with NLP models for pattern matching. We also ran into CAPTCHA challenges while scraping LinkedIn, which we mitigated by rotating user agents and proxies.
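Matching descriptions against a predefined skills list might look like the sketch below. It is an approximation under stated assumptions: plain regular expressions stand in for the spaCy pattern matching the project used, and the `SKILLS` list and the `build_pattern` and `extract_skills` helpers are hypothetical names introduced for illustration.

```python
import re

# Hypothetical subset of the predefined skills list.
SKILLS = ["Python", "Java", "JavaScript", "C++", "AWS", "Node.js"]


def build_pattern(skills):
    """Compile one case-insensitive pattern that matches any listed skill."""
    # Longest first so "JavaScript" is tried before "Java".
    ordered = sorted(skills, key=len, reverse=True)
    alternatives = "|".join(re.escape(s) for s in ordered)
    # Lookarounds replace \b, which misbehaves next to symbols such as "+".
    return re.compile(
        rf"(?<![\w+#])(?:{alternatives})(?![\w+#])", re.IGNORECASE
    )


def extract_skills(description, skills=SKILLS):
    """Return the listed skills found in a description, in list order."""
    canonical = {s.lower(): s for s in skills}
    pattern = build_pattern(skills)
    found = {
        canonical[match.group(0).lower()]
        for match in pattern.finditer(description)
    }
    return [s for s in skills if s in found]


text = "We build services in Java and C++, deployed on AWS. JavaScript is a plus."
print(extract_skills(text))  # ['Java', 'JavaScript', 'C++', 'AWS']
```

A fixed list trades recall for precision: company-specific jargon is ignored unless it is added to the list, but every reported skill is one the comparison with a curriculum can actually use.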
Job Postings Table
The table below shows a sample of the job postings scraped from LinkedIn, including the job title, company, location, and other relevant details:
Title | Company | Location | Description | Technical Skills | Matching Skills | Unmatching Skills | Match Percentage |
---|---|---|---|---|---|---|---|
Junior Web Developer (React JS) | HireMeFast LLC | Denver, CO | This is a remote position... | JavaScript, jQuery | computer science, JavaScript | jQuery | 75.00% |
Software Engineer | Microsoft | Mountain View, CA | The Azure Networking team is... | C++, Computer Science | C++, Computer Science | Azure | 71.43% |
Software Engineer I | Roche | | Roche fosters diversity, equity... | Java, JavaScript, C++ | Computer Science, Java | distributed systems, AWS | 26.67% |
Software Engineer - Recent Graduate | PayPal | | At PayPal, we believe that every... | Java, GraphQL | Computer Science, Java | distributed systems, APIs | 33.33% |
Software Engineer, New Grad | IXL Learning | | IXL Learning is a leading EdTech... | Python, JavaScript | Python, JavaScript | Oracle, Scala | 75.00% |
Software Engineer | ParetoHealth | | ParetoHealth is committed to... | AWS, JavaScript | JavaScript | AWS, APIs, CI / CD | 14.29% |
Software Engineer | Peoplr, LLC | Ann Arbor, MI | Are you an ambitious developer... | AWS, TypeScript, bash | PHP | AWS, TypeScript | 25.00% |
Software Engineer | Zip | | Zip is tackling the $50B+ TAM... | Kubernetes, Python | Python, Computer Science | Kubernetes, Jira | 28.57% |
Software Engineer | Microsoft | Redmond, WA | The Azure Redis Cache team... | C++, Computer Science | C++, Computer Science | Azure, Redis | 77.78% |
Software Engineer, Full Stack | Blend | | Blend is a diverse team of problem... | Python, TypeScript | C++, Python | PostgreSQL, Angular | 54.55% |
Software Engineer Intern | Vbrick | Herndon, VA | Vbrick enables organizations... | JavaScript, Agile | Computer Science | Agile, microservices | 66.67% |
Entry Level Software Engineer | Engtal | | We are seeking a highly motivated... | C++, Python | C++, Python | | 100.00% |
Software Engineer - Littleton, CO | Lockheed Martin | Littleton, CO | The coolest jobs on this planet... | Artificial Intelligence | | Artificial Intelligence | 0.00% |
Software Engineer I - Summer 2024 | Wayfair | | Wayfair is on a mission to find... | Java, Computer Science | Computer Science, Java | Big data, algorithms | 33.33% |
Software Engineer | Microsoft | Mountain View, CA | The Identity and Network Access... | C++, Computer Science | C++, Computer Science | Azure, Redis | 85.71% |
Front-End Software Engineer | Yahoo | United States | Yahoo Finance is the largest... | JavaScript, web development | Computer Science, JavaScript | Agile | 80.00% |
Software Engineer II | Bubble | New York, NY | Bubble.io is a unique platform... | AWS, TypeScript, Node.js | | AWS, TypeScript, Node.js | 0.00% |
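
One plausible way to derive the last three columns is a set comparison between the skills extracted from a posting and the skills covered by a curriculum. The sketch below assumes that Match Percentage is the share of a posting's skills that also appear in the curriculum; the exact formula is not stated here, and the `compare_skills` helper and the sample inputs are hypothetical rather than taken from the scraped data.

```python
def compare_skills(job_skills, curriculum_skills):
    """Split a posting's skills into matched/unmatched and compute the match share."""
    # Compare case-insensitively but report the posting's original spelling.
    curriculum = {s.lower() for s in curriculum_skills}
    matching = [s for s in job_skills if s.lower() in curriculum]
    unmatching = [s for s in job_skills if s.lower() not in curriculum]
    total = len(job_skills)
    # Assumed formula: matched skills divided by all skills in the posting.
    percentage = 100.0 * len(matching) / total if total else 0.0
    return matching, unmatching, percentage


# Illustrative inputs, not taken from the scraped data.
curriculum = ["Python", "Java", "JavaScript", "Computer Science"]
job = ["JavaScript", "Computer Science", "Python", "jQuery"]

matching, unmatching, pct = compare_skills(job, curriculum)
print(matching)        # ['JavaScript', 'Computer Science', 'Python']
print(unmatching)      # ['jQuery']
print(f"{pct:.2f}%")   # 75.00%
```

A posting with no extracted skills returns 0.0 here instead of raising a division error; whether such postings should be excluded from the averages instead is a separate design decision.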
Results
Our LinkedIn scraper collected over 500 job postings and provided a clear picture of the top skills in demand. Comparing these results with the curriculum of Cincinnati State Technical and Community College showed a 75% match between the skills taught and those required in the job market.