Android Basic JSOUP Tutorial

In this tutorial, you will learn how to use the open source Java library JSOUP in your Android application. JSOUP provides a convenient API for extracting and manipulating data using DOM, CSS, and jQuery-like methods, and it can fetch and parse HTML from a URL, a file, or a string. We will create three buttons on the main view; each button performs a different task: showing the website's title, its description, or its logo. So let's begin.

Before you proceed with this tutorial, download the latest JSOUP library from here.

Paste the downloaded JSOUP jar file into your project's libs folder, as shown in the image below.

jsoup_libs

Create a new project in Eclipse via File > New > Android Application Project. Fill in the details and name your project JsoupTutorial.

Application Name : JsoupTutorial

Project Name : JsoupTutorial

Package Name : com.androidbegin.jsouptutorial

Open your MainActivity.java and paste the following code.

MainActivity.java

package com.androidbegin.jsouptutorial;

import java.io.IOException;
import java.io.InputStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import android.os.AsyncTask;
import android.os.Bundle;
import android.app.Activity;
import android.app.ProgressDialog;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.ImageView;
import android.widget.TextView;

public class MainActivity extends Activity {

	// URL Address
	String url = "https://www.androidbegin.com";
	ProgressDialog mProgressDialog;

	@Override
	public void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		setContentView(R.layout.activity_main);

		// Locate the Buttons in activity_main.xml
		Button titlebutton = (Button) findViewById(R.id.titlebutton);
		Button descbutton = (Button) findViewById(R.id.descbutton);
		Button logobutton = (Button) findViewById(R.id.logobutton);

		// Capture button click
		titlebutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Title AsyncTask
				new Title().execute();
			}
		});

		// Capture button click
		descbutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Description AsyncTask
				new Description().execute();
			}
		});

		// Capture button click
		logobutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Logo AsyncTask
				new Logo().execute();
			}
		});

	}

	// Title AsyncTask
	private class Title extends AsyncTask<Void, Void, Void> {
		String title;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Get the html document title
				title = document.title();
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set title into TextView
			TextView txttitle = (TextView) findViewById(R.id.titletxt);
			txttitle.setText(title);
			mProgressDialog.dismiss();
		}
	}

	// Description AsyncTask
	private class Description extends AsyncTask<Void, Void, Void> {
		String desc;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the Meta data
				Elements description = document
						.select("meta[name=description]");
				// Locate the content attribute
				desc = description.attr("content");
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set description into TextView
			TextView txtdesc = (TextView) findViewById(R.id.desctxt);
			txtdesc.setText(desc);
			mProgressDialog.dismiss();
		}
	}

	// Logo AsyncTask
	private class Logo extends AsyncTask<Void, Void, Void> {
		Bitmap bitmap;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {

			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the class data
				Elements img = document.select("a[class=brand brand-image] img[src]");
				// Locate the src attribute
				String imgSrc = img.attr("src");
				// Download image from URL
				InputStream input = new java.net.URL(imgSrc).openStream();
				// Decode Bitmap
				bitmap = BitmapFactory.decodeStream(input);

			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set downloaded image into ImageView
			ImageView logoimg = (ImageView) findViewById(R.id.logo);
			logoimg.setImageBitmap(bitmap);
			mProgressDialog.dismiss();
		}
	}
}
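The three AsyncTask classes above share one pattern: do the network work off the main thread, then hand the result back. AsyncTask has been deprecated since API level 30, so the following is a minimal sketch of the same pattern in plain Java, not the tutorial's own code. The class name BackgroundTaskSketch and the method runInBackground are hypothetical names chosen for illustration. In an Activity, the callback would additionally have to be moved onto the main thread (for example with runOnUiThread) before it touches any view.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of the tutorial project.
final class BackgroundTaskSketch {

    // Runs `work` on a background thread and passes its result to `onResult`.
    // The returned future completes once the callback has finished.
    static <T> CompletableFuture<Void> runInBackground(Supplier<T> work, Consumer<T> onResult) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CompletableFuture<Void> done =
                CompletableFuture.supplyAsync(work, executor).thenAccept(onResult);
        // No new tasks are accepted after this call, but the submitted one still runs.
        executor.shutdown();
        return done;
    }

    public static void main(String[] args) {
        // The supplier stands in for a slow call such as Jsoup.connect(url).get().
        runInBackground(() -> "fetched title",
                result -> System.out.println("Result: " + result))
                .join(); // wait here only so the demo does not exit early
    }
}
```

The join() call is there for the command-line demo only; a user interface would not block while waiting for the result.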

In this activity, we have created three buttons, each of which launches a different AsyncTask. Before explaining further, here are the steps for viewing the HTML source code of a website.

Step 1 : Visit https://www.androidbegin.com with any preferred Internet browser on your PC

homepage

 

Step 2 : Right-click on an open space and select “View page source”

pagesource

 

Step 3 : The website's source code is displayed

source

 

A website's source code determines how its pages appear in the browser. However, the page source shows only the markup and code that are delivered to the browser; code that is processed on the server is not visible.

The first button retrieves the website title by calling the title() method on the parsed Document.

Java Code

@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Get the html document title
				title = document.title();
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

Website Source Code











<title>Android Java Tutorials, Examples, Guides, Development - AndroidBegin</title>
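To try the title lookup without a network connection, JSOUP can also parse an HTML string directly. The sketch below assumes the JSOUP jar is on the classpath; the class name TitleSketch, the method extractTitle, and the sample markup are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical helper, not part of the tutorial project.
final class TitleSketch {

    // Parses an HTML string and returns the text of its <title> element.
    // Document.title() returns an empty string when the page has no title.
    static String extractTitle(String html) {
        Document document = Jsoup.parse(html);
        return document.title();
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for a downloaded page.
        String html = "<html><head><title>Sample Page</title></head><body></body></html>";
        System.out.println(extractTitle(html));
    }
}
```

Because title() falls back to an empty string, onPostExecute can set the text without a null check when the page was fetched but has no title.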


 

The second button retrieves the website description. The select() call uses a CSS selector to locate the meta element whose name attribute is description, and attr("content") then reads that element's content attribute.

Java Code

@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the Meta data
				Elements description = document.select("meta[name=description]");
				// Locate the content attribute
				desc = description.attr("content");
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

Website Source Code 

Android Java Tutorials, Examples, Guides, Development - AndroidBegin
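The same selector can be exercised on an in-memory HTML string. This is a minimal sketch that assumes the JSOUP jar is on the classpath; DescriptionSketch, extractDescription, and the sample markup are hypothetical and only illustrate the lookup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Hypothetical helper, not part of the tutorial project.
final class DescriptionSketch {

    // Returns the content attribute of the description meta element,
    // or an empty string when the page has no such element.
    static String extractDescription(String html) {
        Document document = Jsoup.parse(html);
        Elements description = document.select("meta[name=description]");
        // Elements.attr reads the attribute from the first matching element.
        return description.attr("content");
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for a downloaded page.
        String html = "<html><head>"
                + "<meta name=\"description\" content=\"A sample description.\">"
                + "</head><body></body></html>";
        System.out.println(extractDescription(html));
    }
}
```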


 

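The third button retrieves the website logo. The selector matches an img element that has a src attribute and sits inside a link whose class attribute is "brand brand-image"; attr("src") reads the image URL, which is then downloaded and decoded into a Bitmap.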

Java Code

@Override
		protected Void doInBackground(Void... params) {

			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the class data
				Elements img = document.select("a[class=brand brand-image] img[src]");
				// Locate the src attribute
				String imgSrc = img.attr("src");
				// Download image from URL
				InputStream input = new java.net.URL(imgSrc).openStream();
				// Decode Bitmap
				bitmap = BitmapFactory.decodeStream(input);

			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

Website Source Code

AndroidBegin
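One thing to watch in the logo code: new java.net.URL(imgSrc) only works when the src attribute holds an absolute URL. If a page uses a relative path, the constructor throws a MalformedURLException (a subclass of IOException), the catch block swallows it, and the bitmap stays null. JSOUP can return an absolute URL itself via attr("abs:src"); the sketch below shows the same resolution step with the standard library only. The class name LogoUrlSketch, the method resolveImageUrl, and the image paths are assumptions for illustration, not values taken from the site.

```java
import java.net.URI;

// Hypothetical helper, not part of the tutorial project.
final class LogoUrlSketch {

    // Resolves an image src against the URL of the page it came from.
    // An absolute src is returned unchanged; a relative one is resolved
    // against pageUrl. The page URL should end with a slash or a path.
    static String resolveImageUrl(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        // Made-up paths, used only to show both cases.
        System.out.println(resolveImageUrl("https://www.androidbegin.com/", "/images/logo.png"));
        System.out.println(resolveImageUrl("https://www.androidbegin.com/", "https://cdn.example.com/logo.png"));
    }
}
```

URI.create rejects strings that are not valid URIs (for example, ones containing spaces), so a real application would want to handle IllegalArgumentException as well.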

 

Next, create an XML graphical layout for the MainActivity. Go to res > layout > Right Click on layout > New > Android XML File

Name your new XML file activity_main.xml and paste the following code.

activity_main.xml



    

    

Next, change the application name and texts. Open your strings.xml in your res > values folder and paste the following code.

strings.xml




    Basic Jsoup Tutorial
    Settings
    Hello world!
    Website Title
    Website Description
    Website Logo

In AndroidManifest.xml, we need to declare the android.permission.INTERNET permission so that the application is allowed to connect to the Internet. Open your AndroidManifest.xml and paste the following code.

AndroidManifest.xml




    

    

    
        
            
                

                
            
        
    

Output:

BasicJsoupTutorial ScreenShots

