<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Martijn's C# Programming Blog &#187; byte array</title>
	<atom:link href="http://www.dijksterhuis.org/tag/byte-array/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dijksterhuis.org</link>
	<description>Information, news about programming in C#</description>
	<lastBuildDate>Fri, 07 Aug 2009 21:26:47 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Encoding C# strings as Byte[] (Byte Arrays) and back again</title>
		<link>http://www.dijksterhuis.org/encoding-c-strings-as-byte-byte-arrays-and-back-again/</link>
		<comments>http://www.dijksterhuis.org/encoding-c-strings-as-byte-byte-arrays-and-back-again/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 07:50:14 +0000</pubDate>
		<dc:creator>Martijn</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Learn C#]]></category>
		<category><![CDATA[byte]]></category>
		<category><![CDATA[byte array]]></category>
		<category><![CDATA[converting]]></category>
		<category><![CDATA[string]]></category>

		<guid isPermaLink="false">http://www.dijksterhuis.org/?p=265</guid>
		<description><![CDATA[
When working with I/O streams (such as when sending and receiving information from a NetworkStream) you often have to convert C# strings into Byte[] (byte arrays) and back again. At this point it is important to consider how you would like to encode your string. This post shows how you can pass a string to a [...]<p>This is a post from <a href="http://www.dijksterhuis.org">Martijn's C# Coding Blog</a>. </p>
]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.dijksterhuis.org/wp-content/uploads/2008/11/byte.jpg" alt="" title="Encoding and Converting C# strings into Byte[] byte arrays " width="500" height="225" class="alignnone size-full wp-image-268" /></p>
<p><em>When working with I/O streams (such as when sending and receiving information from a NetworkStream) you often have to convert C# strings into Byte[] (byte arrays) and back again. At this point it is important to consider how you would like to encode your string. This post shows how you can pass a string to a method that only accepts byte arrays, and how you can turn byte arrays back into strings again. </em><br />
<span id="more-265"></span></p>
<p>One major difference between C and C# is that C# stores all strings as Unicode (UTF-16 in .NET), whereas a C string is just an array of bytes.</p>
<p>In the old days, when computers were still newish, the ASCII standard defined the first 128 characters, so a single byte (which can hold 256 different values) was more than sufficient for communication. As time went by and computers had to speak more languages, the second half (128-255) was mapped to the characters of various languages. Many different encoding schemes (also called code pages) were designed, including multi-byte ones that could represent Japanese &#038; Chinese (some 6000+ characters) by using more than one byte for a single character.</p>
<p>It was however still impossible to write a single e-mail that contained Ancient Greek, Chinese and modern Russian. So work was started on the Unicode project. For Unicode it was originally decided that a 2 byte combination (65,536 values) was sufficient to hold all the world's languages. That later proved too small: Unicode now defines code points up to U+10FFFF, and a C# string stores characters beyond the first 65,536 as a pair of 16 bit values (a surrogate pair).</p>
<p>The basic unit of a memory cell or a communication stream is still the byte. A function which sends or receives information thus has to work with Byte[] (byte arrays).</p>
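<p>As a minimal sketch of that point, the example below writes a string to a stream and reads it back. It assumes a MemoryStream as a stand-in for a real NetworkStream and picks UTF-8 as the encoding (both encodings are discussed in the solutions that follow); the class and method names are made up for illustration.</p>
<pre class="brush: c#">
using System;
using System.IO;
using System.Text;

class StreamSketch
{
    // Writes the message to a stream as bytes and decodes what arrived
    public static string SendAndReceive(string message)
    {
        // Stream.Write only accepts bytes, so the string is encoded first
        byte[] outgoing = Encoding.UTF8.GetBytes(message);

        // Assumption: MemoryStream stands in here for a NetworkStream
        using (MemoryStream stream = new MemoryStream())
        {
            stream.Write(outgoing, 0, outgoing.Length);

            // The receiving side sees nothing but bytes
            byte[] incoming = stream.ToArray();
            return Encoding.UTF8.GetString(incoming);
        }
    }

    static void Main()
    {
        Console.WriteLine(SendAndReceive(&quot;Hello World&quot;));   // Hello World
    }
}
</pre>
<p>Both sides have to agree on the encoding: the bytes themselves carry no hint about how they should be turned back into characters.</p>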
<p><strong>Solution #1  &#8211; Convert Unicode to ASCII / String to an ASCII Byte[]</strong></p>
<p>If you intend to send only the most basic of messages, which can be satisfied with just A-Z, a-z &#038; 0-9 and a few other characters, you can convert the C# string using the ASCII encoder. You will however lose any characters that are not defined by ASCII: the encoder replaces each of them with a question mark. So while this is a good idea if your application is only used in North America, the rest of the world will probably not thank you for this design decision.</p>
<p><em>Convert a string to a byte[]</em></p>
<pre class="brush: c#">
// Native C# strings are Unicode (UTF-16) encoded
string StringMessage = &quot;Hello World How Are you? Pi \u03C0 Yen \uFFE5&quot;;

// We can show the characters on the command line
Console.WriteLine(&quot;{0}&quot;, StringMessage);

// We can convert the string directly to a byte array, but Pi and Yen are
// not ASCII characters, so each of them is replaced by a '?'
System.Text.ASCIIEncoding ASCII  = new System.Text.ASCIIEncoding();
Byte[] BytesMessage = ASCII.GetBytes(StringMessage);
</pre>
<p><em>To convert a byte[] back into a string</em></p>
<pre class="brush: c#">
Byte[] BytesMessage = { 72, 101, 108, 108, 111 }; // Your message, here the ASCII bytes of &quot;Hello&quot;
System.Text.ASCIIEncoding ASCII  = new System.Text.ASCIIEncoding();
String StringMessage = ASCII.GetString( BytesMessage );
</pre>
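<p>The loss described for Solution #1 can be seen in a quick round trip. This is only a sketch: the class and method names are invented for the example, and it uses the shared Encoding.ASCII instance rather than creating a new ASCIIEncoding.</p>
<pre class="brush: c#">
using System;
using System.Text;

class AsciiLossSketch
{
    // Encodes to ASCII and decodes again, returning what survives
    public static string RoundTrip(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }

    static void Main()
    {
        // Pi (U+03C0) is not an ASCII character
        byte[] bytes = Encoding.ASCII.GetBytes(&quot;Pi \u03C0&quot;);
        Console.WriteLine(bytes[3]);                           // 63, the code for '?'
        Console.WriteLine(RoundTrip(&quot;Pi \u03C0&quot;));   // Pi ?
    }
}
</pre>
<p>Once the question mark has been written into the byte array there is no way to recover the original character on the receiving side.</p>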
<p><strong>Solution #2 &#8211; Convert the Unicode string to UTF-8 / String to a UTF-8 encoded byte[]</strong></p>
<p>These days a Western web browser can display Chinese pages, and e-mails are sent to and received from anywhere. But as many existing systems (including e-mail!) still limit transmission to 7 or 8 bit characters, a number of standards exist to encode Unicode strings for 7 or 8 bit communication. The most commonly used encoding method is UTF-8, which encodes every Unicode character as a sequence of 8 bit bytes and leaves plain ASCII text unchanged.</p>
<p><em>Convert a string to a UTF-8 encoded byte[]</em></p>
<pre class="brush: c#">
// Native C# strings are Unicode (UTF-16) encoded
string StringMessage = &quot;Hello World How Are you? Pi \u03C0 Yen \uFFE5&quot;;

// We can convert the string directly to a byte array, this time without losing characters
System.Text.UTF8Encoding UTF8 = new System.Text.UTF8Encoding();
Byte[] BytesMessage = UTF8.GetBytes(StringMessage);
</pre>
<p><em>Convert a UTF-8 Byte Array back into a string</em></p>
<pre class="brush: c#">
Byte[] BytesMessage = { 80, 105, 32, 0xCF, 0x80 }; // Your message, here the UTF-8 bytes of &quot;Pi \u03C0&quot;
System.Text.UTF8Encoding UTF8 = new System.Text.UTF8Encoding();
String StringMessage = UTF8.GetString( BytesMessage );
</pre>
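<p>When the bytes come from a stream, the receive buffer is usually larger than the message, so only the bytes that were actually read should be decoded. The sketch below shows one way to do that with the GetString(bytes, index, count) overload. It assumes a MemoryStream in place of a NetworkStream and a buffer size of 1024, and the class and method names are hypothetical.</p>
<pre class="brush: c#">
using System;
using System.IO;
using System.Text;

class ReadSketch
{
    // Reads once from the stream and decodes only the bytes that arrived
    public static string ReadMessage(Stream stream)
    {
        byte[] buffer = new byte[1024];   // assumed buffer size
        int bytesRead = stream.Read(buffer, 0, buffer.Length);
        return Encoding.UTF8.GetString(buffer, 0, bytesRead);
    }

    static void Main()
    {
        byte[] payload = Encoding.UTF8.GetBytes(&quot;Pi \u03C0&quot;);

        // Assumption: MemoryStream stands in here for a NetworkStream
        using (MemoryStream stream = new MemoryStream(payload))
        {
            string text = ReadMessage(stream);
            Console.WriteLine(text.Length);   // 4
        }
    }
}
</pre>
<p>On a real network connection a single Read may return only part of a message, and a multi-byte character can be split across two reads. A Decoder obtained from Encoding.UTF8.GetDecoder() keeps state between calls and is probably the better fit in that situation.</p>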
<p>As a side note: a UTF-8 encoded Unicode character does not simply translate to 2 bytes, so the length of the created Byte[] is not simply 2 times the number of characters in the string.</p>
<p>In fact each Unicode character is encoded as 1 &#8211; 4 bytes: ASCII characters take a single byte, while Pi (\u03C0) takes 2 and the Yen sign (\uFFE5) takes 3. If you would like to know more about the encoding scheme, have a look at the <a rel="nofollow" href="http://en.wikipedia.org/wiki/Utf8">Wikipedia UTF-8 page</a>.</p>
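<p>The variable length of UTF-8 can be checked with GetByteCount. The following is a small sketch with a made-up helper name; the last sample is a character outside the first 65,536 code points (U+1F600), which a C# string stores as two 16 bit values.</p>
<pre class="brush: c#">
using System;
using System.Text;

class ByteCountSketch
{
    // Number of bytes the text occupies once encoded as UTF-8
    public static int Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    static void Main()
    {
        Console.WriteLine(Utf8Bytes(&quot;A&quot;));             // 1
        Console.WriteLine(Utf8Bytes(&quot;\u03C0&quot;));        // 2
        Console.WriteLine(Utf8Bytes(&quot;\uFFE5&quot;));        // 3
        Console.WriteLine(Utf8Bytes(&quot;\uD83D\uDE00&quot;));  // 4 (one character, stored as a surrogate pair)
    }
}
</pre>
<p>If a protocol needs a length prefix, it is the byte count that has to be sent, not the Length of the string.</p>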
<p><small>Image credit: <a rel="nofollow" href="http://www.flickr.com/photos/roland/">roland</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dijksterhuis.org/encoding-c-strings-as-byte-byte-arrays-and-back-again/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
