Android Basic JSOUP Tutorial

In this tutorial, you will learn how to use the JSOUP open source Java library in your Android application. JSOUP provides a convenient API for extracting and manipulating data using DOM, CSS, and jQuery-like methods. It can fetch and parse HTML from a URL, a file, or a string. We will create three buttons on the main view, and each button will perform a different task: showing the website's title, its description, or its logo. So let's begin…

Before you proceed with this tutorial, download the latest JSOUP library from here.

Paste your downloaded Jsoup file into your project libs folder as shown on the image below.

jsoup_libs

Create a new project in Eclipse: File > New > Android Application Project. Fill in the details and name your project JsoupTutorial.

Application Name : JsoupTutorial

Project Name : JsoupTutorial

Package Name : com.androidbegin.jsouptutorial

Open your MainActivity.java and paste the following code.

MainActivity.java

package com.androidbegin.jsouptutorial;

import java.io.IOException;
import java.io.InputStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import android.os.AsyncTask;
import android.os.Bundle;
import android.app.Activity;
import android.app.ProgressDialog;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.ImageView;
import android.widget.TextView;

public class MainActivity extends Activity {

	// URL Address
	String url = "https://www.androidbegin.com";
	ProgressDialog mProgressDialog;

	@Override
	public void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		setContentView(R.layout.activity_main);

		// Locate the Buttons in activity_main.xml
		Button titlebutton = (Button) findViewById(R.id.titlebutton);
		Button descbutton = (Button) findViewById(R.id.descbutton);
		Button logobutton = (Button) findViewById(R.id.logobutton);

		// Capture button click
		titlebutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Title AsyncTask
				new Title().execute();
			}
		});

		// Capture button click
		descbutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Description AsyncTask
				new Description().execute();
			}
		});

		// Capture button click
		logobutton.setOnClickListener(new OnClickListener() {
			public void onClick(View arg0) {
				// Execute Logo AsyncTask
				new Logo().execute();
			}
		});

	}

	// Title AsyncTask
	private class Title extends AsyncTask<Void, Void, Void> {
		String title;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Get the html document title
				title = document.title();
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set title into TextView
			TextView txttitle = (TextView) findViewById(R.id.titletxt);
			txttitle.setText(title);
			mProgressDialog.dismiss();
		}
	}

	// Description AsyncTask
	private class Description extends AsyncTask<Void, Void, Void> {
		String desc;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {
			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the Meta data
				Elements description = document
						.select("meta[name=description]");
				// Locate the content attribute
				desc = description.attr("content");
			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set description into TextView
			TextView txtdesc = (TextView) findViewById(R.id.desctxt);
			txtdesc.setText(desc);
			mProgressDialog.dismiss();
		}
	}

	// Logo AsyncTask
	private class Logo extends AsyncTask<Void, Void, Void> {
		Bitmap bitmap;

		@Override
		protected void onPreExecute() {
			super.onPreExecute();
			mProgressDialog = new ProgressDialog(MainActivity.this);
			mProgressDialog.setTitle("Android Basic JSoup Tutorial");
			mProgressDialog.setMessage("Loading...");
			mProgressDialog.setIndeterminate(false);
			mProgressDialog.show();
		}

		@Override
		protected Void doInBackground(Void... params) {

			try {
				// Connect to the web site
				Document document = Jsoup.connect(url).get();
				// Using Elements to get the class data
				Elements img = document.select("a[class=brand brand-image] img[src]");
				// Locate the src attribute
				String imgSrc = img.attr("src");
				// Download image from URL
				InputStream input = new java.net.URL(imgSrc).openStream();
				// Decode Bitmap
				bitmap = BitmapFactory.decodeStream(input);

			} catch (IOException e) {
				e.printStackTrace();
			}
			return null;
		}

		@Override
		protected void onPostExecute(Void result) {
			// Set downloaded image into ImageView
			ImageView logoimg = (ImageView) findViewById(R.id.logo);
			logoimg.setImageBitmap(bitmap);
			mProgressDialog.dismiss();
		}
	}
}
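
The activity above does its network work inside AsyncTask, which Android deprecated in API level 30. As a rough sketch only, the same "work in the background, then hand the result back" flow can be written with plain java.util.concurrent executors. The class name BackgroundTaskSketch, the Callback interface, and the runThenDeliver method are hypothetical names made up for this illustration. In a real activity, the result executor would have to post to the UI thread, for example by wrapping Activity.runOnUiThread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BackgroundTaskSketch {

	/** Hypothetical callback type: receives the result, or null if the work failed. */
	interface Callback<T> {
		void onResult(T result);
	}

	/** Runs the work on one executor and delivers the result on another. */
	static <T> void runThenDeliver(Executor background, final Executor resultThread,
			final Callable<T> work, final Callback<T> callback) {
		background.execute(new Runnable() {
			@Override
			public void run() {
				T result = null;
				try {
					// Plays the role of doInBackground()
					result = work.call();
				} catch (Exception e) {
					// Same error handling as the tutorial: log and continue with null
					e.printStackTrace();
				}
				final T delivered = result;
				resultThread.execute(new Runnable() {
					@Override
					public void run() {
						// Plays the role of onPostExecute()
						callback.onResult(delivered);
					}
				});
			}
		});
	}

	public static void main(String[] args) throws InterruptedException {
		ExecutorService background = Executors.newSingleThreadExecutor();
		// Stand-in for the UI thread (an assumption made for this desktop demo)
		ExecutorService resultThread = Executors.newSingleThreadExecutor();

		runThenDeliver(background, resultThread,
				() -> "Website title",
				result -> System.out.println("Result: " + result));

		// The background task queues the delivery before it finishes,
		// so shutting down in this order still runs the callback.
		background.shutdown();
		background.awaitTermination(5, TimeUnit.SECONDS);
		resultThread.shutdown();
		resultThread.awaitTermination(5, TimeUnit.SECONDS);
	}
}
```

As in the tutorial code, a failed request leaves the result as null, so the callback should be prepared to receive null.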

In this activity, we have created three buttons, each of which starts its own AsyncTask. Before I proceed with further explanation, see the steps below on how to view the HTML source code of a website.

Step 1 : Visit https://www.androidbegin.com with any preferred Internet browser on your PC

homepage

 

Step 2 : Right-click on an open space and select “View page source”

pagesource

 

Step 3 : View the website source code

source

 

A website's source code determines how its pages appear. However, the page source shows only the HTML that the server sends to the browser; code that runs on the server is not visible.

The first button retrieves the website title. Calling document.title() returns the text of the page's title tag.

Java Code

@Override
protected Void doInBackground(Void... params) {
	try {
		// Connect to the web site
		Document document = Jsoup.connect(url).get();
		// Get the html document title
		title = document.title();
	} catch (IOException e) {
		e.printStackTrace();
	}
	return null;
}

Website Source Code

<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="description" content="Android Java Tutorials, Examples, Development" /><link rel="profile" href="http://gmpg.org/xfn/11" />
<link rel="pingback" href="https://www.androidbegin.com/xmlrpc.php" />

<!--[if lt IE 9]>
<script src="https://www.androidbegin.com/wp-content/themes/bliss/assets/js/html5.js" type="text/javascript"></script>
<![endif]-->

<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-35207555-1']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>
<!-- This site is optimized with the Yoast WordPress SEO plugin v1.5.4.2 - https://yoast.com/wordpress/plugins/seo/ -->
<title>Android Java Tutorials, Examples, Guides, Development - AndroidBegin</title>
<meta name="description" content="Android Java tutorials, examples, guides and development for beginners. Learn Android programming with complete source code available for download."/>
<meta name="keywords" content="Android Tutorials, Android Samples, Android Guides, Android Tips, Android Apps, Android Games"/>

 

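The second button retrieves the website description. By using Elements with a CSS selector, we are able to specify the exact location of the data: meta[name=description] matches every meta tag whose name attribute is description, and attr("content") returns the content value of the first match that has that attribute.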

Java Code

@Override
protected Void doInBackground(Void... params) {
	try {
		// Connect to the web site
		Document document = Jsoup.connect(url).get();
		// Using Elements to get the Meta data
		Elements description = document.select("meta[name=description]");
		// Locate the content attribute
		desc = description.attr("content");
	} catch (IOException e) {
		e.printStackTrace();
	}
	return null;
}

Website Source Code 

<title>Android Java Tutorials, Examples, Guides, Development - AndroidBegin</title>
<meta name="description" content="Android Java tutorials, examples, guides and development for beginners. Learn Android programming with complete source code available for download."/>
<meta name="keywords" content="Android Tutorials, Android Samples, Android Guides, Android Tips, Android Apps, Android Games"/>
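
To experiment with the selector without a network connection, the same calls can be run against an HTML string. The following is a minimal sketch, assuming the Jsoup jar is on the classpath; the class name MetaDescriptionSketch and the sample markup are made up for illustration. The sample deliberately contains two description tags to show that attr("content") reads the first match only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

final class MetaDescriptionSketch {

	// Hypothetical sample markup with two description tags
	static final String HTML = "<html><head>"
			+ "<meta name=\"description\" content=\"First description\" />"
			+ "<title>Sample Title</title>"
			+ "<meta name=\"description\" content=\"Second description\" />"
			+ "</head><body></body></html>";

	// Same call as the first button, applied to a parsed string
	static String title(String html) {
		return Jsoup.parse(html).title();
	}

	// Same selector and attribute lookup as the second button
	static String firstDescription(String html) {
		Document document = Jsoup.parse(html);
		Elements description = document.select("meta[name=description]");
		// Value from the first matched element that has a content attribute
		return description.attr("content");
	}

	// Number of meta tags that match the selector
	static int descriptionCount(String html) {
		return Jsoup.parse(html).select("meta[name=description]").size();
	}

	public static void main(String[] args) {
		System.out.println(title(HTML));
		System.out.println(firstDescription(HTML));
		System.out.println(descriptionCount(HTML));
	}
}
```

If a page carries more than one description tag and you need a specific one, you can iterate over the Elements instead of relying on the first match.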

 

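The third button retrieves the website logo. The selector a[class=brand brand-image] img[src] matches an img tag that has a src attribute and sits inside an a tag whose class attribute is exactly "brand brand-image". The src value is then used to download the image and decode it into a Bitmap.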

Java Code

@Override
protected Void doInBackground(Void... params) {

	try {
		// Connect to the web site
		Document document = Jsoup.connect(url).get();
		// Using Elements to get the class data
		Elements img = document.select("a[class=brand brand-image] img[src]");
		// Locate the src attribute
		String imgSrc = img.attr("src");
		// Download image from URL
		InputStream input = new java.net.URL(imgSrc).openStream();
		// Decode Bitmap
		bitmap = BitmapFactory.decodeStream(input);

	} catch (IOException e) {
		e.printStackTrace();
	}
	return null;
}

Website Source Code

<div class="row-fluid top-banner">
			<div class="container">
				<div class="banner-overlay"></div>
									<a class="brand brand-image" href="https://www.androidbegin.com/" title="AndroidBegin" rel="home"><img src="https://www.androidbegin.com/wp-content/uploads/2013/08/Web-Logo364.png" alt="AndroidBegin"><h1></h1></a>
								<div class="top-banner-social pull-right" style="top:15px;">				</div>
			</div>
		</div>
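
In the source above the src attribute holds a complete URL, so new java.net.URL(imgSrc) works as is. On other sites the src may be relative, and then the URL constructor throws a MalformedURLException, which is a kind of IOException. As a hedged sketch, a relative src can be resolved against the page address with java.net.URI before downloading. The class name ImageUrlSketch, the method resolveImageUrl, and the path images/logo.png are hypothetical, and the sketch assumes src is a syntactically valid URI reference. Jsoup also offers this kind of resolution through the abs: attribute prefix, for example attr("abs:src").

```java
import java.net.URI;

final class ImageUrlSketch {

	/** Resolves an img src value against the address of the page it came from. */
	static String resolveImageUrl(String pageUrl, String src) {
		URI base = URI.create(pageUrl);
		// java.net.URI mis-resolves relative paths against a base with an empty
		// path, so "https://host" is first turned into "https://host/".
		if (base.getPath() == null || base.getPath().isEmpty()) {
			base = base.resolve("/");
		}
		// An absolute src is returned unchanged; a relative one is joined to the base
		return base.resolve(src).toString();
	}

	public static void main(String[] args) {
		String page = "https://www.androidbegin.com";
		// Absolute URL, as found in the tutorial's page source
		System.out.println(resolveImageUrl(page,
				"https://www.androidbegin.com/wp-content/uploads/2013/08/Web-Logo364.png"));
		// Hypothetical relative path, for illustration only
		System.out.println(resolveImageUrl(page, "images/logo.png"));
	}
}
```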

 

Next, create an XML graphical layout for the MainActivity. Go to res > layout, right-click on layout, and select New > Android XML File.

Name your new XML file activity_main.xml and paste the following code.

activity_main.xml

<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent" >

    <TextView
        android:id="@+id/titletxt"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:gravity="center" />

    <Button
        android:id="@+id/titlebutton"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:layout_below="@+id/titletxt"
        android:text="@string/Title" />

    <TextView
        android:id="@+id/desctxt"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_below="@+id/titlebutton"
        android:layout_centerInParent="true"
        android:gravity="center" />

    <Button
        android:id="@+id/descbutton"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:layout_below="@+id/desctxt"
        android:text="@string/Description" />

    <ImageView
        android:id="@+id/logo"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_below="@+id/descbutton"
        android:layout_centerInParent="true" />

    <Button
        android:id="@+id/logobutton"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:layout_below="@+id/logo"
        android:text="@string/Logo" />

</RelativeLayout>

Next, change the application name and texts. Open your strings.xml in your res > values folder and paste the following code.

strings.xml

<?xml version="1.0" encoding="utf-8"?>
<resources>

    <string name="app_name">Basic Jsoup Tutorial</string>
    <string name="action_settings">Settings</string>
    <string name="hello_world">Hello world!</string>
    <string name="Title">Website Title</string>
    <string name="Description">Website Description</string>
    <string name="Logo">Website Logo</string>

</resources>

In your AndroidManifest.xml, we need to declare the INTERNET permission to allow the application to connect to the Internet. Open your AndroidManifest.xml and paste the following code.

AndroidManifest.xml

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.androidbegin.jsouptutorial"
    android:versionCode="1"
    android:versionName="1.0" >

    <uses-sdk
        android:minSdkVersion="8"
        android:targetSdkVersion="17" />

    <uses-permission android:name="android.permission.INTERNET" />

    <application
        android:allowBackup="true"
        android:icon="@drawable/ic_launcher"
        android:label="@string/app_name"
        android:theme="@style/AppTheme" >
        <activity
            android:name=".MainActivity"
            android:label="@string/app_name" >
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>

Output:

BasicJsoupTutorial ScreenShots

