Handling encodings has always been tricky. Fortunately, UTF-8
has become the dominant encoding for all things Internet, so if you stick with it, things will likely just work and you can focus on other, more important things.
But recently I found myself calling an API where the response was encoded in windows-1252 rather than UTF-8, and things started to break when characters showed up as � in my Angular app.
Of course, I did what any respectable developer would do when faced with this problem: I googled. To my surprise, after sifting through a whole lot of Stack Overflow, Medium, and various other pages, I didn't find anything that clearly spelled out how to decode the response to UTF-8.
There were pages that suggested using iconv-lite or similar packages to do the decoding. That doesn't work here, because these packages usually depend on the Node.js Buffer type, which is not available to an app running in the browser; the first tell-tale sign something was amiss was the TypeScript compiler erroring on the unknown Buffer type. Of course, there were plenty of other pages telling you to add Buffer to tsconfig.app.json to “fix” the errors; but no, your app will compile and then fail at runtime when it can't find Buffer.
Other pages detailed methods to do the decoding yourself using a variety of loops, mappings, and/or deprecated functions like escape/unescape. While these solutions might work, I didn't find them particularly elegant (and shouldn't there be packages that wrap these methods so they are easy to consume?).
In the end, the solution was actually simple, and I'm not sure why it took me so long to find and piece the parts together. It turns out all I needed was the TextDecoder interface, which takes in an ArrayBuffer and returns a decoded string, paired with Angular's HttpClient.
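Here's a minimal sketch of the approach, assuming a simple Angular service (the service name and endpoint URL are placeholders): request the body as an ArrayBuffer so HttpClient doesn't try to interpret it as UTF-8 text, then hand the bytes to a TextDecoder configured for windows-1252.

```typescript
import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Observable } from 'rxjs';
import { map } from 'rxjs/operators';

@Injectable({ providedIn: 'root' })
export class LegacyApiService {
  constructor(private http: HttpClient) {}

  // Fetch a windows-1252 encoded response and return it as a decoded string.
  getLegacyText(url: string): Observable<string> {
    return this.http
      // Ask for the raw bytes instead of letting HttpClient assume UTF-8 text/JSON.
      .get(url, { responseType: 'arraybuffer' })
      .pipe(
        // TextDecoder turns the windows-1252 bytes into a regular JavaScript string.
        map((buffer: ArrayBuffer) => new TextDecoder('windows-1252').decode(buffer))
      );
  }
}
```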
As you can see, the code is trivial. If you’re reading this post, I hope it is helpful and saves you from an hour or so of googling.