VPS + Own homepage

HavocXphere

I'm thinking of grabbing a small VPS to host a personal homepage that scrapes together content of interest (weather, stock markets etc).

Bit fuzzy on how to make this work though - haven't done any web programming. I'm guessing I need something like:

Domain (already bought) > VPS > Debian > Cron job w/ scraper > web server

Don't know what language to use for the scraper. I'm guessing Python would be a good start since I recall seeing some scraping libraries for it.

Ideas on how best to build the HTML (?) that the web server serves? I'm guessing I'll need to piece it together more or less dynamically via code from what was scraped, or is there a better way?
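One common way to do this is to skip dynamic serving entirely: the cron job rebuilds a static index.html and the web server just serves that file. Here's a minimal sketch, assuming the scraper hands over a dict of headings and text lines. The function names, the sample data and the paths in the crontab comment are all made up for illustration.

```python
import html
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical crontab entry (paths are assumptions), rebuilding every 30 minutes:
#   */30 * * * * /usr/bin/python3 /home/user/build_page.py

PAGE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>My homepage</title></head>
<body>
<h1>My homepage</h1>
<p>Updated {updated}</p>
{sections}
</body>
</html>
"""


def render_page(sections, updated):
    """Build the page from a dict mapping heading -> list of text lines."""
    blocks = []
    for heading, lines in sections.items():
        # Escape everything: scraped text must never be trusted as markup.
        items = "\n".join(f"<li>{html.escape(line)}</li>" for line in lines)
        blocks.append(f"<h2>{html.escape(heading)}</h2>\n<ul>\n{items}\n</ul>")
    return PAGE.format(updated=html.escape(updated), sections="\n".join(blocks))


def write_atomically(path, text):
    """Write to a temp file, then rename, so visitors never see a half-written page."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(text, encoding="utf-8")
    tmp.replace(path)


if __name__ == "__main__":
    # Placeholder data standing in for whatever the scraper collected.
    data = {"Weather": ["Sunny, 21 C"], "Markets": ["Index A: +0.4%"]}
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    write_atomically(Path("index.html"), render_page(data, stamp))
```

Point the web server's document root at the folder holding index.html and there's no application code running on each request, which keeps a small VPS lightly loaded.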

EDIT: Somewhat unrelated...if I host something like a Teamspeak server or similar...what are the chances of that beating Skype quality wise?
 
http://php.net/manual/en/class.domdocument.php

If you are familiar with the DOM, this will make things much easier.

You can just scrape the entire element containing the info you want and drop its outerHTML into your web page if you don't need to do anything with the data.
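The link above is PHP's DOMDocument. Since Python came up earlier in the thread, here's a rough sketch of the same "grab one element's outer HTML" idea using only the standard library's html.parser. The class and function names, the id "wx" and the sample page are assumptions for illustration, and it only handles reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementGrabber(HTMLParser):
    """Collects the outer HTML of the first element with a given id."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target element
        self.parts = []     # raw markup pieces of the target element
        self.done = False   # True once the target has been fully captured

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        capturing = self.depth > 0
        if not capturing and dict(attrs).get("id") != self.target_id:
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1
        elif not capturing:
            self.done = True  # the target itself is a void element

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: keep the text, leave depth alone.
        if self.done:
            return
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())
        elif dict(attrs).get("id") == self.target_id:
            self.parts.append(self.get_starttag_text())
            self.done = True

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0 and not self.done:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0 and not self.done:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0 and not self.done:
            self.parts.append(f"&#{name};")


def outer_html(page, target_id):
    """Return the element's outer HTML, or None if it is missing or unclosed."""
    grabber = ElementGrabber(target_id)
    grabber.feed(page)
    grabber.close()
    return "".join(grabber.parts) if grabber.done else None


if __name__ == "__main__":
    sample = ('<html><body><div id="wx"><p>Sunny<br>21 C</p></div>'
              '<div id="other">x</div></body></html>')
    print(outer_html(sample, "wx"))
```

For messy real-world pages a dedicated parsing library is probably less painful, but this shows the idea without installing anything.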
 
I'm sure you could consume RSS feeds to get the data you want.

No need to write a web scraper
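For reference, reading an RSS 2.0 feed needs nothing beyond Python's standard library. A minimal sketch follows; the function names and the sample feed are invented for illustration, and Atom feeds use different element names, so they'd need separate handling.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss(xml_text, limit=5):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    # In RSS 2.0 the entries live at rss > channel > item.
    for item in root.iterfind("./channel/item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        items.append((title, link))
        if len(items) >= limit:
            break
    return items


def fetch_rss(url, limit=5):
    """Download a feed and parse it. The URL is whatever feed you want to follow."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_rss(resp.read(), limit)


if __name__ == "__main__":
    # Made-up feed so the example runs without network access.
    sample = """<?xml version="1.0"?>
    <rss version="2.0"><channel><title>Example feed</title>
      <item><title>First story</title><link>https://example.com/1</link></item>
      <item><title>Second story</title><link>https://example.com/2</link></item>
    </channel></rss>"""
    for title, link in parse_rss(sample):
        print(title, link)
```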
 
Good point. Won't cover everything I need, so def need some scraping, but maybe RSS is a good baby-steps starting point.

http://php.net/manual/en/class.domdocument.php

If you are familiar with the DOM, this will make things much easier.

You can just scrape the entire element containing the info you want and drop its outerHTML into your web page if you don't need to do anything with the data.
Not familiar with it but sounds useful. Quick google suggests it's some kind of XML-y thing, which makes sense.
 
Kind of sounds like you're describing RSS feeds to me... Why reinvent the wheel?
Where there are RSS feeds I'll use those. Stuff like the local weather service doesn't have one though...so I'll def need some scraping one way or the other.

Gonna try & hotlink some of it straight through though...since it's a personal website I'd imagine a little bit of hotlinking will fly under the radar
 
You may want to check the legal ramifications of page scraping.
 
Not too worried tbh. This stuff is going to go on a page that will have a robots.txt to prevent indexing.

So not exactly starting a content farm here...
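For anyone setting this up: a robots.txt that asks all crawlers to stay away is two lines, and Python's urllib.robotparser can check that the rules mean what you think. A small sketch, where the helper name and the example.com URL are placeholders. Keep in mind robots.txt is only a request that well-behaved crawlers honour, not access control.

```python
from urllib import robotparser

# Contents of a robots.txt asking every crawler to skip the whole site.
ROBOTS_TXT = """User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, url, agent="*"):
    """True if the given robots.txt rules disallow `agent` from fetching `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com stands in for the real domain.
    print(is_blocked(ROBOTS_TXT, "https://example.com/index.html"))
```

The file itself goes in the web root so it's reachable at /robots.txt.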
 