How to save the data in a URL location to a file using Java

This is pretty simple and straightforward, but people sometimes mess it up. Here's how it is done with the least code I know of.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

    public void saveUrlToFile(File saveFile, String location) {
        try {
            URL url = new URL(location);
            // Open a character reader on the URL and a writer on the target file.
            BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
            BufferedWriter out = new BufferedWriter(new FileWriter(saveFile));
            char[] cbuf = new char[255];
            int count;
            // read() may fill only part of the buffer, so keep the count
            // and write exactly that many characters.
            while ((count = in.read(cbuf)) != -1) {
                out.write(cbuf, 0, count);
            }
            in.close();
            out.close();

        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

You can use this to download a file from the net manually if you want to. But it cannot be used as a download accelerator, since it uses only one thread to download.
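For example, a quick usage sketch (the class name, file name and URL here are placeholders, not from the code above):

    // Assuming the method above lives on some class UrlSaver (hypothetical name).
    UrlSaver saver = new UrlSaver();
    saver.saveUrlToFile(new File("page.html"), "http://example.com/index.html");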

4 Responses to “How to save the data in a URL location to a file using Java”

  1. Mat Says:

    You should always put your “out.close()” and “in.close()” in a finally block.
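
    A minimal sketch of that, reusing the names from the post's method (note close() itself can throw IOException, which this sketch lets propagate; if in.close() throws, out.close() is skipped):

        BufferedReader in = null;
        BufferedWriter out = null;
        try {
            in = new BufferedReader(new InputStreamReader(url.openStream()));
            out = new BufferedWriter(new FileWriter(saveFile));
            char[] cbuf = new char[255];
            int count;
            while ((count = in.read(cbuf)) != -1) {
                out.write(cbuf, 0, count);
            }
        } finally {
            // These run even when the copy loop throws.
            if (in != null) in.close();
            if (out != null) out.close();
        }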

  2. sameer Says:

    This method fails when you have special formats; it only retrieves binary formats. What if it’s a custom indexed file?
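
    A byte-for-byte copy sidesteps the format question, since no character decoding happens; a rough sketch (requires java.io.InputStream, java.io.OutputStream and java.io.FileOutputStream imports, with url and saveFile as in the post's method):

        // Copy raw bytes so binary and custom formats survive unchanged.
        InputStream in = url.openStream();
        OutputStream out = new FileOutputStream(saveFile);
        byte[] buf = new byte[4096];
        int count;
        while ((count = in.read(buf)) != -1) {
            out.write(buf, 0, count);   // write only the bytes actually read
        }
        in.close();
        out.close();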

  3. Murthy Upadhyayula Says:

    This will just copy the HTML code; it does not resolve the relative links.
    If there are any JS or CSS files associated with the webpage, then they are not downloaded. If we open the webpage, the content may look misaligned, as the required CSS files are missing.

    • samindaw Says:

      Yes. That is what a web browser also does when it requests a page from a URL. In addition, the browser will parse (scan) the retrieved page and determine whether there are further resources that need to be retrieved.
      If you want such behavior, then you also have to write a parser for the retrieved content and determine which resources need to be retrieved (a rough sketch follows below). But that is not the purpose of this post. The purpose is just to save the content of a URL; parsing is a completely different topic.
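
      Just to illustrate the idea, a very naive regex scan over the saved page text (the htmlContent variable is assumed; a real HTML parser would be far more robust):

          // Naive scan for src/href attributes in the saved HTML.
          java.util.regex.Pattern p =
              java.util.regex.Pattern.compile("(?:src|href)=\"([^\"]+)\"");
          java.util.regex.Matcher m = p.matcher(htmlContent);
          while (m.find()) {
              System.out.println("resource: " + m.group(1)); // candidate to fetch next
          }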
