Convert String to Byte Array Java Program

Welcome to our tutorial on converting between a String and a byte array in Java. In this guide, we’ll explore several ways to perform these conversions. First, we’ll convert a String into a byte array. Then we’ll do the reverse and convert a byte array back into a String.

Converting a String to Byte Array

In Java, a String represents a sequence of Unicode characters, exposed as 16-bit char values. To convert it to a byte array, we translate that sequence of characters into a sequence of bytes. The translation is defined by a Charset, a class that specifies a mapping between sequences of chars and sequences of bytes. Converting chars to bytes is called encoding.
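
To make the idea of a charset mapping concrete, here is a minimal sketch that encodes the same one-character string with two different charsets. The class name CharsetMappingDemo and the helper encode are illustrative names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetMappingDemo {
    // Encodes the text with the given charset and returns the bytes in readable form.
    static String encode(String text, Charset charset) {
        return Arrays.toString(text.getBytes(charset));
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // the single character U+00E9 (e with acute accent)

        // ISO-8859-1 maps U+00E9 to one byte (0xE9, printed as the signed value -23)
        System.out.println("ISO-8859-1: " + encode(text, StandardCharsets.ISO_8859_1));

        // UTF-8 maps the same character to two bytes (0xC3 0xA9)
        System.out.println("UTF-8     : " + encode(text, StandardCharsets.UTF_8));
    }
}
```

The same character yields different bytes depending on the charset, which is why the charset has to be known on both sides of a conversion.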

Published On:www.softwaretestingo.com

In Java, the String class provides three overloaded getBytes methods to encode a string into a byte array. Let’s look at each of them in detail with examples. These methods are:

  • getBytes() – encodes using the platform’s default charset
  • getBytes(String charsetName) – encodes using the named charset
  • getBytes(Charset charset) – encodes using the provided charset

If you’re one of the many developers who use Java’s getBytes() method without specifying a character encoding, be aware that you’re taking a gamble that could cause your code to break.

Although it may be convenient to use Java’s getBytes() method to convert Strings to byte arrays, many developers don’t do so correctly. Almost 70% of the code reviewed uses getBytes() without character encoding, which leaves the possibility that the platform’s default character encoding will influence the results.

This article shows why you should always call getBytes() with an explicit character encoding. The StandardCharsets class defines constants for the character encodings that every Java implementation supports out of the box. We will review them as well.

It’s also good practice to specify the character encoding in your code using one of the pre-defined constants instead of a free text or String to avoid typos and other silly mistakes.

String to a byte array using getBytes()

You can use Java’s built-in .getBytes() method to convert a string into a byte array. This is the most common way to do it, but remember that it might produce an erroneous result if the platform’s character encoding doesn’t match your expectations.

package com.SoftwareTestingO.Array;
import java.util.Arrays;
public class StringToByteArray 
{
	public static void main(String[] args) 
	{
		// No charset is given, so the platform's default charset is used
		byte[] bytes = "SoftwareTestingo".getBytes(); 
		System.out.println("platform's default character encoding : " + System.getProperty("file.encoding")); 
		System.out.println("length of byte array in default encoding : " + bytes.length); 
		System.out.println("contents of byte array in default encoding: " + Arrays.toString(bytes));
	}
}

Remarks:

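  • Java will use the platform’s default encoding to convert characters to bytes if you don’t specify any character encoding. Since JDK 18 that default is UTF-8 unless it is overridden; on older versions it depends on the operating system and locale.
  • If you’re wondering what your machine’s default character encoding is, you can call Charset.defaultCharset() or read System.getProperty("file.encoding").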
  • If you see different results in your code running in different environments, it may be because of the default character encoding difference. Remember this: Don’t rely on the defaults to stay consistent between QA and production.
  • The length of the byte array may not be the same as that of the String. This is because it depends on character encoding. Some character encodings are multi-byte, but only 1 byte is usually needed to encode ASCII characters.
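
The last remark can be checked with a short sketch that compares the String length with the byte count under two encodings. The class name ByteLengthDemo and the helper byteLength are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {
    // Number of bytes the text occupies once encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "Java";
        String accented = "Caf\u00E9"; // 4 chars, the last one is outside ASCII

        // Pure ASCII text: one byte per character in UTF-8
        System.out.println(ascii.length() + " chars, "
                + byteLength(ascii, StandardCharsets.UTF_8) + " bytes in UTF-8");

        // The accented character needs two bytes in UTF-8 but one in ISO-8859-1
        System.out.println(accented.length() + " chars, "
                + byteLength(accented, StandardCharsets.UTF_8) + " bytes in UTF-8, "
                + byteLength(accented, StandardCharsets.ISO_8859_1) + " bytes in ISO-8859-1");
    }
}
```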

String to a byte array using getBytes("encoding")

You can specify the encoding by name when you convert a string to a byte array, so there is no guesswork and no dependence on platform defaults.

package com.SoftwareTestingO.Array;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
public class StringToByteArray1 
{
	public static void main(String[] args) 
	{
		try 
		{ 
			byte[] utf16 = "SoftwareTestingo".getBytes("UTF-16"); 
			System.out.println("length of byte array in UTF-16 character encoding : " + utf16.length); 
			System.out.println("contents of byte array in UTF-16 encoding: " + Arrays.toString(utf16)); 
		} 
		catch (UnsupportedEncodingException e) 
		{ 
			e.printStackTrace(); 
		}
	}
}

Remarks:

  • It’s an improvement on the previous approach, but it throws a checked exception (java.io.UnsupportedEncodingException) if the character encoding String has a typo or specifies a character encoding not supported by Java.
  • The byte array that is returned will be in the specified character encoding.
  • It’s worth noting that the length of the byte array is not the same as the number of characters in the String. UTF-16 uses at least two bytes per character, and Java’s "UTF-16" charset also writes a two-byte byte-order mark at the start, so the 16-character string above produces 34 bytes.
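
A small sketch of how the UTF-16 variants differ in length. The class name Utf16LengthDemo and the helper byteLength are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf16LengthDemo {
    // Number of bytes the text occupies once encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "Hi";

        // UTF-16 prepends a byte-order mark (0xFE 0xFF), then writes big-endian code units
        System.out.println("UTF-16   : " + Arrays.toString(text.getBytes(StandardCharsets.UTF_16)));

        // UTF-16BE and UTF-16LE fix the byte order and write no byte-order mark
        System.out.println("UTF-16BE : " + Arrays.toString(text.getBytes(StandardCharsets.UTF_16BE)));
        System.out.println("UTF-16LE : " + Arrays.toString(text.getBytes(StandardCharsets.UTF_16LE)));
    }
}
```

If the receiving side does not expect a byte-order mark, UTF-16BE or UTF-16LE is usually the safer choice.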

Root Cause For UnsupportedEncodingException

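If you’re getting this exception from String.getBytes, you’re likely passing an encoding name that is misspelled or that your Java runtime doesn’t support. A commonly used encoding is UTF-8. Every Java implementation is required to support US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16; Charset.availableCharsets() lists everything your runtime supports.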

How to reproduce this issue

If you see the error "java.io.UnsupportedEncodingException", the encoding name is not recognized, and Java cannot encode the String into bytes. The program below reproduces the error by passing "UTF", which is not a valid encoding name.

package com.SoftwareTestingO.Array;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
public class StringToByteArray1 
{
	public static void main(String[] args) 
	{
		try 
		{ 
			// "UTF" is not a valid charset name, so getBytes throws UnsupportedEncodingException
			byte[] bytes = "SoftwareTestingo".getBytes("UTF"); 
			System.out.println("length of byte array : " + bytes.length); 
			System.out.println("contents of byte array : " + Arrays.toString(bytes)); 
		} 
		catch (UnsupportedEncodingException e) 
		{ 
			e.printStackTrace(); 
		}
	}
}

Solution

Pass the name of an encoding that Java supports to the String.getBytes() method. When you need more control over the encoding process, use the CharsetEncoder class.
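
When the encoding name comes from configuration or user input, one option is to validate it before encoding. The following is a sketch under that assumption; the class name CharsetLookup and the helper lookup are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

public class CharsetLookup {
    // Returns the charset for the given name, or empty if the name is
    // syntactically illegal or not supported by this Java runtime.
    static Optional<Charset> lookup(String name) {
        try {
            return Optional.of(Charset.forName(name));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A valid name resolves to a Charset that can be passed to getBytes(Charset)
        lookup("UTF-8").ifPresent(cs ->
                System.out.println("encoded length: " + "SoftwareTestingo".getBytes(cs).length));

        // "UTF" is the name used in the failing example above
        System.out.println("UTF supported? " + lookup("UTF").isPresent());
    }
}
```

Resolving the name once with Charset.forName also means the rest of the code can use getBytes(Charset), which has no checked exception.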

String to a byte array using getBytes(Charset)

This is the third and probably the best way to convert a string to a byte array in Java. In this example, I have used java.nio.charset.StandardCharsets to specify the character encoding. This class contains constants for widely used character encodings such as UTF-8 and UTF-16.

This approach doesn’t throw the checked java.io.UnsupportedEncodingException. The getBytes(Charset) overload was added in Java 6 and the StandardCharsets class in Java 7, so the example below needs JDK 7 or later.

package com.SoftwareTestingO.Array;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
public class StringToByteArray2 
{
	public static void main(String[] args) 
	{
		byte[] utf8 = "abcdefgh".getBytes(StandardCharsets.UTF_8); 
		System.out.println("length of byte array in UTF-8 : " + utf8.length); 
		System.out.println("contents of byte array in UTF-8: " + Arrays.toString(utf8));
	}
}

Remarks:

  • It is the best way to convert a string to a byte array in Java.
  • There is no need for boilerplate code to handle the checked exception of java.io.UnsupportedEncodingException.
  • You have to use Java 7 or later to be able to use the StandardCharsets class.

Using Charset.encode()

You can use the Charset class’s encode() method to convert a String into bytes. It returns a ByteBuffer; to get a byte array, copy the buffer’s remaining bytes into an array. Calling array() directly returns the buffer’s whole backing array, which can be longer than the encoded data.

The program below encodes a string with US-ASCII. Characters that are not part of the ASCII character set are mapped to the charset’s default replacement byte, which is '?' (63) for US-ASCII.

package com.SoftwareTestingO.Array;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
public class StringToByteArray3 
{
	public static void main(String[] args) 
	{
		// creating string
		String inputStr = "SoftwareTestingo ब्लॉग";

		// create charset object
		Charset charset = StandardCharsets.US_ASCII;

		// encoding string
		ByteBuffer buffer = charset.encode(inputStr);

		// Copy only the encoded bytes out of the ByteBuffer
		byte[] array = new byte[buffer.remaining()];
		buffer.get(array);

		// printing byte array
		for (int j = 0; j < array.length; j++) 
		{
			System.out.print(" " + array[j]);
		}
	}
}
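
A ByteBuffer's array() method returns its whole backing array, which may be larger than the encoded content, so it is safer to copy only the bytes between the buffer's position and limit. The sketch below shows one way to do that; the class name ByteBufferCopyDemo and the helper toByteArray are illustrative names, and the exact backing-array size depends on the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferCopyDemo {
    // Copies only the readable bytes (position..limit) out of the buffer.
    static byte[] toByteArray(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode("SoftwareTestingo");

        // The backing array may contain unused trailing bytes
        System.out.println("backing array length : " + buffer.array().length);

        // The copy contains exactly the encoded bytes
        System.out.println("encoded byte count   : " + toByteArray(buffer).length);
    }
}
```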

CharsetEncoder

If you need more control over characters that cannot be mapped, the CharsetEncoder class provides methods that can help. The onUnmappableCharacter() and replaceWith() methods let you override the default handling of unmappable characters.

package com.SoftwareTestingO.Array;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
public class StringToByteArray4 
{
	public static void main(String[] args) throws CharacterCodingException 
	{
		// creating string
		String inputStr = "SoftwareTestingo ब्लॉग";

		CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();

		// Replace every unmappable character with a zero byte
		encoder.onMalformedInput(CodingErrorAction.IGNORE)
		.onUnmappableCharacter(CodingErrorAction.REPLACE)
		.replaceWith(new byte[] { 0 });

		// Copy only the encoded bytes; array() would return the whole backing array
		ByteBuffer buffer = encoder.encode(CharBuffer.wrap(inputStr));
		byte[] array = new byte[buffer.remaining()];
		buffer.get(array);

		// printing byte array
		for (int j = 0; j < array.length; j++) 
		{
			System.out.print(" " + array[j]);
		}
	}
}

To create and configure a CharsetEncoder:

  • Call the newEncoder() method on a Charset object.
  • Specify actions for error conditions by calling the onMalformedInput() and onUnmappableCharacter() methods.

We can specify the following actions:

  • IGNORE – drop the erroneous input
  • REPLACE – replace the erroneous input
  • REPORT – report the error by returning a CoderResult object or throwing a CharacterCodingException
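
As an illustration of the REPORT action, the sketch below uses a CharsetEncoder to check whether a string can be encoded as US-ASCII without loss. The class name StrictAsciiCheck and the helper isPureAscii are illustrative names.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictAsciiCheck {
    // Returns true only if every character of the text can be encoded as US-ASCII.
    static boolean isPureAscii(String text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // With REPORT, encode() throws instead of dropping or replacing input
            encoder.encode(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPureAscii("SoftwareTestingo")); // true
        System.out.println(isPureAscii("Caf\u00E9"));        // false
    }
}
```

REPORT is useful when silently losing data is worse than failing, for example when validating input before storing it.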

You have now completed a review of the different ways to convert a String into a byte array. Next, let’s look at how to reverse this operation.

Converting a Byte Array to a String

We can convert a byte array to a string using the decode method of the Charset class. This process is known as decoding. Just like encoding, we need to specify which charset to use.

If we want to decode a byte array back into a String, however, we need to use the same charset originally used to encode the string into a byte array. Otherwise, we risk not being able to decode the string properly.
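
Here is a minimal sketch of what goes wrong when the charsets don't match. The class name MojibakeDemo and the helper roundTrip are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes the text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "Caf\u00E9"; // 4 characters

        // Same charset on both sides: the original text comes back
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + same.equals(original)); // true

        // Decoding UTF-8 bytes as ISO-8859-1 turns the two-byte character into two characters
        String mixed = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("length after mismatched decode: " + mixed.length()); // 5
    }
}
```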

Let’s try understanding how to convert a byte array to a String with examples and programs.

Using String Constructor

The String class has several constructors that accept a byte array. They work in the reverse direction of the getBytes methods and are just as simple. Let’s start by converting a byte array to a String using the platform’s default charset.

package com.SoftwareTestingO.Array;
import java.util.Arrays;
public class StringToByteArray01 
{
	public static void main(String[] args) 
	{
		// Convert String to byte array using the platform's default charset
		byte[] byteArray = "SoftwareTestingo".getBytes(); 
		
		System.out.println("contents of byte array in default encoding: " + Arrays.toString(byteArray));
		
		// Convert byte array back to String, again using the default charset
		String decode = new String(byteArray);
		System.out.println("Decoded string : " + decode);
	}
}

When you pass a Charset (or the name of one) to the String constructor, it uses that charset for decoding. For example, to decode with an explicit charset instead of the platform default, we would do something like this:

package com.SoftwareTestingO.Array;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringToByteArray02 
{
	public static void main(String[] args)
	{
		Charset charset = StandardCharsets.UTF_8;
		byte[] byteArray = {83, 111, 102, 116, 119, 97, 114, 101, 84, 101, 115, 116, 105, 110, 103, 111}; 
		
		// Print the raw bytes that will be decoded
		System.out.println("contents of byte array: " + Arrays.toString(byteArray));
		
		//Converting Byte Array to String
		String decode = new String(byteArray, charset);
		System.out.println("Decoded string : " + decode);
	}
}

Using Charset.decode()

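The Charset class has a decode() method that decodes a ByteBuffer into a CharBuffer, which you can turn into a String with toString(). The example below shows how to use the Charset.decode() method.

Note: Malformed or unmappable byte sequences are replaced with the charset’s default replacement string, typically the Unicode replacement character U+FFFD.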

package com.SoftwareTestingO.Array;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
public class StringToByteArray03 
{
	public static void main(String[] args)
	{
		// creating byte array
		byte[] byteArray = {83,111,102,116,119,97,114,101,84,101,115,116,105,110,103,111,32,63,63,63,63,63};
		
		// create charset object
		Charset charset = StandardCharsets.US_ASCII;
		ByteBuffer byteBuffer = ByteBuffer.wrap(byteArray);
		
		// decoding string
		String output = charset.decode(byteBuffer).toString();
		System.out.println("output  : " + output);
	}
}
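
To see the replacement behavior in action, the sketch below decodes a byte array that ends with an incomplete UTF-8 sequence. The class name ReplacementDecodeDemo and the helper decodeUtf8 are illustrative names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ReplacementDecodeDemo {
    // Decodes the bytes as UTF-8; invalid sequences become U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // 'H', 'i', then 0xC3: the first byte of a two-byte sequence with nothing after it
        byte[] truncated = {72, 105, (byte) 0xC3};

        String decoded = decodeUtf8(truncated);

        // The dangling byte is decoded as the replacement character U+FFFD
        System.out.println("ends with U+FFFD: " + decoded.endsWith("\uFFFD")); // true
    }
}
```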

Using CharsetDecoder

If you need control over the decoding process, CharsetDecoder is a good option. It lets you choose how malformed or unmappable input is handled, for example by substituting a replacement string of your choice.

package com.SoftwareTestingO.Array;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
public class StringToByteArray04 
{
	public static void main(String[] args) throws CharacterCodingException
	{
		// creating byte array
		byte[] byteArray = {83,111,102,116,119,97,114,101,84,101,115,116,105,110,103,111,32,63,63,63,63,63};

		CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder();

		// Drop malformed input and replace unmappable input with "?"
		decoder.onMalformedInput(CodingErrorAction.IGNORE)
		.onUnmappableCharacter(CodingErrorAction.REPLACE)
		.replaceWith("?");

		ByteBuffer byteBuffer = ByteBuffer.wrap(byteArray);
		String finalDecodedStr = decoder.decode(byteBuffer).toString();

		System.out.println("Output : " + finalDecodedStr);
	}
}
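
A CharsetDecoder can also be configured to reject bad input instead of repairing it. The sketch below, which assumes you want strict UTF-8 validation, returns an empty Optional when the bytes are not valid UTF-8. The class name StrictUtf8Decoder and the helper decodeStrict are illustrative names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class StrictUtf8Decoder {
    // Decodes the bytes as UTF-8, or returns empty if they are not valid UTF-8.
    static Optional<String> decodeStrict(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return Optional.of(decoder.decode(ByteBuffer.wrap(bytes)).toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        byte[] valid = "SoftwareTestingo".getBytes(StandardCharsets.UTF_8);
        byte[] invalid = {72, 105, (byte) 0xC3}; // ends with an incomplete sequence

        System.out.println(decodeStrict(valid).isPresent());   // true
        System.out.println(decodeStrict(invalid).isPresent()); // false
    }
}
```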

Conclusion:

After going through this article, we hope you understand how to convert a String to a byte array in Java with the getBytes methods, and how to convert a byte array back into a String. If you still face any problems, leave a comment, and we will try to help you resolve the issue.

Softwaretestingo Editorial Board