Scrape various open data directories to create an index of what's available out there
Python Awesome » Scrape
by John
1y ago
scrape-open-data Scrapes every available dataset from Socrata and stores each one as newline-delimited JSON in this repository, to track changes over time through Git scraping. socrata/data.delaware.gov.jsonl contains the latest datasets for a specific domain; this is updated twice a day. socrata/data.delaware.gov.stats.jsonl contains information on page views and download numbers; this is updated only once a week, so that the twice-daily fetches don't pick up changed counts for many different datasets on every run. scrape_socrata.py: run python scrape_socrata.py socrata/ to scrape the data from Socrata and save it in the ..read more
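The repository's scrape_socrata.py isn't quoted above; as a rough illustration of the idea, here is a minimal sketch that pages through the public Socrata Discovery API and writes one JSON object per line, assuming the socrata/&lt;domain&gt;.jsonl naming the post describes:

```python
# Minimal sketch of Git-scraping a Socrata catalog to NDJSON.
# Assumes the public Socrata Discovery API; output naming follows
# the socrata/<domain>.jsonl convention described in the post.
import json
import os
import requests

DOMAIN = "data.delaware.gov"
API = "https://api.us.socrata.com/api/catalog/v1"

def fetch_datasets(domain):
    offset = 0
    while True:
        resp = requests.get(API, params={"domains": domain, "limit": 100, "offset": offset})
        resp.raise_for_status()
        results = resp.json().get("results", [])
        if not results:
            break
        yield from results
        offset += len(results)

os.makedirs("socrata", exist_ok=True)
with open(f"socrata/{DOMAIN}.jsonl", "w") as f:
    for dataset in fetch_datasets(DOMAIN):
        # one JSON object per line, so Git diffs show dataset-level changes
        f.write(json.dumps(dataset, sort_keys=True) + "\n")
```

Serializing with sort_keys keeps each line's content stable between runs, so the Git history only records real changes to datasets rather than key-order noise.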
Visit website
NEW LOGGED BADGES SCRAPER
Python Awesome » Scrape
by John
1y ago
discord-badges-screaper Hello world! I made a badge finder that keeps a Discord log; I think it will be useful for you, so I'm sharing it right away. Configure it by editing the script: line 5, your user token; line 3, the ID of the server you want to scan; line 2, the chat channel of that server; line 54, the webhook link for the scanned results. GitHub View Github ..read more
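The script itself isn't shown, so exactly how it detects badges is unclear. For context, Discord exposes profile badges as the public_flags bitfield on a user object; a minimal decoding sketch, with the flag values taken from Discord's documented UserFlags and a hypothetical user dict:

```python
# Sketch: decode Discord badge bits from a user's public_flags field.
# Flag values follow Discord's documented UserFlags bitfield.
BADGES = {
    1 << 0: "Discord Staff",
    1 << 1: "Partnered Server Owner",
    1 << 2: "HypeSquad Events",
    1 << 3: "Bug Hunter Level 1",
    1 << 6: "HypeSquad Bravery",
    1 << 7: "HypeSquad Brilliance",
    1 << 8: "HypeSquad Balance",
    1 << 9: "Early Supporter",
    1 << 14: "Bug Hunter Level 2",
    1 << 17: "Early Verified Bot Developer",
    1 << 22: "Active Developer",
}

def decode_badges(public_flags: int) -> list:
    return [name for bit, name in BADGES.items() if public_flags & bit]

# Hypothetical user object as returned by the Discord API:
user = {"username": "example", "public_flags": (1 << 6) | (1 << 9)}
print(decode_badges(user["public_flags"]))  # ['HypeSquad Bravery', 'Early Supporter']
```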
Visit website
WebScrappy: an easy-to-use library that makes web scraping a simple task
Python Awesome » Scrape
by Python Awesome
2y ago
WebScrappy WebScrappy is an easy-to-use library for web scraping. It isn't mature yet, and I keep working on making it better. For now, you can read the little tutorial! Download: git clone https://github.com/zsendokame/webscrappy (or: pip install webscrappy). Tutorial:
import WebScrappy
import requests

get = requests.get('http://example.com')
findClass = WebScrappy.getClass(get.text, "The class you want to get")
print(findClass)
# {'upbutton49': {'line': '49', 'class': 'upbutton', 'html': '<button onclick="action" id="upbutton">Go Up</button>', 'text': 'Text ..read more
Visit website
Scrapes a job website for Python developer jobs and exports the data to a CSV file
Python Awesome » Scrape
by Python Awesome
2y ago
WebScraping A web-scraping Python program that scrapes a job website for Python developer jobs and exports the data to a CSV file. Requests downloads the data from the web server over HTTP and saves the response; the response variable contains all the HTML data, which can then be used to extract whatever information you need. The Beautiful Soup library is used to parse the HTML data. Title, company name, location, salary, and job summary are extracted into a Python dictionary. Pandas is used to load the data into a DataFrame and export it to CSV, as sketched below. GitHub View Github ..read more
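The post doesn't include the source, so here is a minimal sketch of the pipeline it describes; the URL and CSS selectors are placeholder assumptions, not the original program's values:

```python
# Sketch: requests -> BeautifulSoup -> dicts -> pandas -> CSV.
# URL and selectors are hypothetical; adapt them to the actual job site.
import requests
from bs4 import BeautifulSoup
import pandas as pd

def text(node):
    # Tolerate missing fields instead of raising on None
    return node.get_text(strip=True) if node else ""

response = requests.get("https://example.com/jobs?q=python+developer", timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

rows = []
for card in soup.select("div.job-card"):  # placeholder selector
    rows.append({
        "title": text(card.select_one("h2.title")),
        "company": text(card.select_one("span.company")),
        "location": text(card.select_one("span.location")),
        "salary": text(card.select_one("span.salary")),
        "summary": text(card.select_one("p.summary")),
    })

pd.DataFrame(rows).to_csv("python_jobs.csv", index=False)
```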
Visit website
SkyScrapers: A collection of various scraping apps
Python Awesome » Scrape
by Python Awesome
2y ago
SkyScrapers A collection of various web-scraping apps. The web scrapers involved in this project are: StockSymbolScraper and UnsplashImagesScraper. Tech: SkyScrapers uses a number of open-source projects to work properly: BeautifulSoup, a Python library for pulling data out of HTML and XML files, and Selenium, which automates web browsers (originally for testing purposes). Please check the markdown files named THEORY inside the respective project directories to get insight into the individual projects. And of course SkyScrapers itself is open source with a public reposi ..read more
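Neither sub-scraper's code is shown above; as a taste of the Selenium side of the stack, a minimal sketch that drives a browser and hands the rendered HTML to BeautifulSoup (the URL and selector are placeholders, not taken from the SkyScrapers repo):

```python
# Sketch: Selenium renders a JavaScript-heavy page, BeautifulSoup parses it.
# URL and selector are placeholders, not from the SkyScrapers project.
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()  # requires a local Chrome/chromedriver setup
try:
    driver.get("https://example.com/stocks")
    soup = BeautifulSoup(driver.page_source, "html.parser")
    for row in soup.select("table.symbols tr"):  # placeholder selector
        print(row.get_text(" ", strip=True))
finally:
    driver.quit()
```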
Visit website
Woob modules for the intranet and other Scouts et Guides de France sites
Python Awesome » Scrape
by Python Awesome
2y ago
Vis’Yerres SGDF – Woob Modules Do you feel that the Scouts et Guides de France intranet doesn't suit your group's needs? Are you building an application intended solely for your group's use, and want to interact with the intranet? Then you've knocked on the right door! This project defines Woob modules for interacting with the SGDF intranet, as well as with a few other resources. An example of use is the following:
from visyerres_sgdf_woob import MODULES_PATH
from woob.core.ouiboube import WebNip

woob = WebNip(modules_path=MODULES_PATH)
backend = woob.build_bac ..read more
Visit website
A Python script that scrapes accounts from public groups via the Telegram API and saves them to a CSV file
Python Awesome » Scrape
by Python Awesome
2y ago
SimpleTelegramScraper – the best scraper on GitHub This simple Python script scrapes accounts from public groups via the Telegram API and saves them to a CSV file with their username, user ID, access hash, group name, group ID, and last time seen online. You can choose to scrape all members, active members (users online today or yesterday), members active in the last week or past month, or inactive members. It can scrape more than 95% of the users in a group! Bots are not included in the CSV file, and the admins are also saved separately to an admins.csv file. It can sometimes occur that towards the end a bug occurs ..read more
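The post doesn't say which client library the script uses; a minimal sketch of the same idea with Telethon, a common Python Telegram client, where api_id, api_hash, and the group username are placeholders:

```python
# Sketch: dump a public group's members to CSV with Telethon.
# api_id, api_hash, and the group username are placeholders.
import csv
from telethon.sync import TelegramClient

api_id = 12345          # from my.telegram.org
api_hash = "your_hash"
group = "some_public_group"

with TelegramClient("scraper", api_id, api_hash) as client:
    with open("members.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["username", "user_id", "access_hash", "group"])
        for user in client.iter_participants(group):
            if user.bot:
                continue  # the original also excludes bots
            writer.writerow([user.username or "", user.id, user.access_hash, group])
```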
Visit website
A lightweight project that scrapes lots of free-proxy sites hourly, validates that each proxy works, and serves a clean proxy list
Python Awesome » Scrape
by Python Awesome
2y ago
Free HTTP Proxy List A lightweight project that scrapes lots of free-proxy sites hourly, validates that each proxy works, and serves a clean proxy list. The scraper found 1272 proxies at the latest update; the usable ones are in the files below. Usage: click the file format that you want and copy the URL.

File                        Content                          Count
data.txt                    ip_address:port, one per line    153
data.json                   ip, port                         153
data-with-geolocation.json  ip, port, geolocation            153

Sources: free-proxy-list.net, us-proxy.org, proxydb.net, free-proxy-list.com, proxy-list.download, vpnoverview.com, proxyscan.io, proxylist.geonode.co ..read more
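On the consuming side, a minimal sketch that pulls data.txt and routes a request through one of the proxies; the raw-file URL is a placeholder, since the excerpt doesn't include it:

```python
# Sketch: fetch the plain-text proxy list and use a random entry.
# PROXY_LIST_URL is a placeholder; substitute the project's raw data.txt URL.
import random
import requests

PROXY_LIST_URL = "https://example.com/path/to/data.txt"

lines = requests.get(PROXY_LIST_URL, timeout=10).text.splitlines()
proxy = random.choice([p for p in lines if p.strip()])  # "ip:port" format

resp = requests.get(
    "https://httpbin.org/ip",
    proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
    timeout=10,
)
print(resp.json())  # shows the proxy's IP if the tunnel worked
```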
Visit website
A Python scraper for Substrate chains
Python Awesome » Scrape
by Python Awesome
2y ago
subscrape A Python scraper for Substrate chains that uses Subscan. Usage: copy config/sample_scrape_config.json to config/scrape_config.json and configure it to your liking; make sure there is a data/parachains folder; then run it, and the corresponding files will be created in data/. If a file already exists in data/, that operation will be skipped in subsequent runs (see the sketch below). Architecture: an overview is given in this Twitter thread: https://twitter.com/alice_und_bob/status/1493714489014956037. General: we use the following methods in the project: logging (https://docs.python.org/3/howto/logging.html) and async operations (htt ..read more
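The skip-if-present behavior described above is a simple idempotency pattern; a minimal sketch of that idea, where the paths and the fetch function are hypothetical rather than subscrape's actual code:

```python
# Sketch: skip work whose output file already exists, so re-runs
# only fetch what's missing. Paths and fetcher are hypothetical.
import json
import os

def scrape_once(chain, fetch):
    path = f"data/parachains/{chain}.json"
    if os.path.exists(path):
        print(f"skipping {chain}: {path} already exists")
        return
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(fetch(chain), f, indent=2)
```

Because existence of the output file is the only state, deleting a file in data/ is enough to force that one operation to run again.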
Visit website
Web scraper built using Python
Python Awesome » Scrape
by Python Awesome
2y ago
Web Scraper This project is made in Python. It takes some info from a list of websites and adds it to a data.json file. The dependencies used are: requests and bs4. Algorithm: the csv reader module is imported and the CSV file is opened; a header array is made and the header values are added; each row is accessed in a loop, the URL is opened, and the country and ASIN values are inserted according to the loop. Using BeautifulSoup we take the HTML code and get the required information, and the information is added to a JSON file (data.json). GitHub View Github ..read more
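A minimal sketch of the algorithm described above; the input filename, URL pattern, and selector are hypothetical, since the post doesn't give them:

```python
# Sketch of the described flow: read (country, asin) rows from a CSV,
# fetch each page, parse with BeautifulSoup, write results to data.json.
# Input filename, URL pattern, and selector are hypothetical.
import csv
import json
import requests
from bs4 import BeautifulSoup

results = []
with open("input.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for country, asin in reader:  # assumes exactly two columns per row
        url = f"https://example.{country}/dp/{asin}"  # placeholder URL pattern
        page = requests.get(url, timeout=10)
        soup = BeautifulSoup(page.text, "html.parser")
        title = soup.select_one("#title")  # placeholder selector
        results.append({
            "country": country,
            "asin": asin,
            "title": title.get_text(strip=True) if title else None,
        })

with open("data.json", "w") as f:
    json.dump(results, f, indent=2)
```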
Visit website
