tango.net.uri question
Posted: 02/17/09 00:16:05
I'm writing a crawler bot for a search engine project and have run into tango.net's lack of documentation. Can anybody tell me how to do the following:
1. Get only the directory part of the path, so I can check it against robots.txt. I'm using tango.net.uri, and all I can get is getPath(), which returns something like '/pages/gallery.html'. How do I get just '/pages/', in a way that still works for a starting URL like 'http://www.wikipedia.org' that has no filename?
2. Send my crawler's User-Agent header when fetching a page. I'm using the tango.net.http.HttpClient sample as a base for retrieving HTML pages.
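To make question 1 concrete, here is roughly the behaviour I'm after, sketched in Python because I don't know the Tango calls for it. `dir_of` is just a name I made up for illustration, not anything from tango.net, and I'm assuming a missing path should be treated as '/'.

```python
from urllib.parse import urlsplit


def dir_of(url: str) -> str:
    """Return the directory part of a URL's path, always ending in '/'."""
    # An absolute URL with no path (e.g. 'http://www.wikipedia.org')
    # yields an empty path; treat that as the site root (my assumption).
    path = urlsplit(url).path or "/"
    # Keep everything up to and including the last slash,
    # which drops a trailing filename such as 'gallery.html'.
    return path[: path.rfind("/") + 1]


if __name__ == "__main__":
    print(dir_of("http://www.wikipedia.org/pages/gallery.html"))
    print(dir_of("http://www.wikipedia.org"))
```

What I need is the equivalent of this using tango.net.uri, or a hint on which Tango text function to use on the result of getPath().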
Thanks.