Privacy and Duty

The last few months have been a tough time for internet security.

We learned that a major computer manufacturer had been shipping laptops with malware and fake root certificates, compromising the secure communications of its customers.

We learned that hackers stole hundreds of millions of dollars from Russian banks.
And we learned that intelligence agencies may have hacked into a major SIM card manufacturer, putting the privacy of millions of people at risk.
Those of us in the IT world have a duty to respond to these incidents.
And I use the word duty very intentionally. Most system administrators have, by the nature of their work, a moral, ethical, contractual and legal obligation to protect client and company data.

For example, if they work for a law firm, then the Canadian Bar Association Code of Professional Conduct includes this section:

Maintaining Information in Confidence
1. The lawyer has a duty to hold in strict confidence all information concerning the business and affairs of the client acquired in the course of the professional relationship, and shall not divulge any such information except as expressly or impliedly authorized by the client, required by law or otherwise required by this Code.


The duty to ‘hold information in strict confidence’ must apply every bit as much to electronic records and communications as to any other type of information.

If you work for a company with a presence in Europe, you are bound by EU data legislation, which includes:

“Everyone has the right to the protection of personal data.”
Under EU law, personal data can only be gathered legally under strict conditions, for a legitimate purpose. Furthermore, persons or organisations which collect and manage your personal information must protect it from misuse and must respect certain rights of the data owners which are guaranteed by EU law.


In my career, I’ve often found myself working with health care data, and thus come under the jurisdiction of Ontario’s Personal Health Information Protection Act, which among other things states:

12.  (1)  A health information custodian shall take steps that are reasonable in the circumstances to ensure that personal health information in the custodian’s custody or control is protected against theft, loss and unauthorized use or disclosure and to ensure that the records containing the information are protected against unauthorized copying, modification or disposal.

And anyone working in the financial industry is likely to find themselves subject to a Code of Ethics such as this one from TD Bank:

A. Protecting Customer Information
“Customer information must be kept private and confidential.”

C. Protecting TD Information
“We must carefully protect the confidential and proprietary information to which we have access, and not disclose it to anyone outside of TD or use it without proper authorization, and then only for the proper performance of our duties.”

Nothing to Hide?

Occasionally, I’ve heard the suggestion that ‘those with nothing to fear have nothing to hide.’
In the light of these duties and obligations, this claim is, of course, absurd. Not only do we in the IT industry have access to, and responsibility for, large amounts of confidential information, we have a moral, ethical, contractual and legal obligation to keep it secure – to ‘hide’ it.
Because we can’t divine intent when our systems come under attack.  Whether it’s a criminal gang, a careless vendor, or a foreign intelligence agency, the attack vectors are the same, and our response must be the same: robustly and diligently protecting the systems and data that have been placed in our care.

A Rough Week for Security

2014 was a tough year for anyone responsible for systems security.  Heartbleed was uncovered in April, which led to some seriously panicky moments as we realised that some secure webservers had been accidentally leaking private information.  And then again later in the year we discovered the Shellshock vulnerability in many Unix systems, leading to yet more sleepless nights as I and countless other systems administrators rushed to patch our systems.

I did find a couple of silver linings in these events, though. Firstly, both of the vulnerabilities, although severe, were the result of genuine mistakes on the part of well-meaning, under-resourced developers, who didn’t anticipate the consequences of some of their design decisions. And secondly, I was intensely proud of how quickly the open source community rallied to provide diagnostic tools, patches, tests, and guides. With a speed and efficiency that I’ve never seen in a large company, a bunch of unpaid volunteers provided the tools we needed to dig ourselves out of the mess.

2015, however, is so far going worse. This week’s security flaws, specifically the ‘Superfish’ scandal (in which Lenovo deliberately sold laptops with a compromised root certificate purely so that third-party software could inject ads into supposedly secure websites, thus exposing millions of users to potential man-in-the-middle attacks) and the now-brewing ‘Privdog’ scandal (trust me, you’ll hear about this soon if you follow security blogs…), are the direct result of vendors choosing to violate the trust of consumers in the interests of chasing tiny increases in their profit margins.

I’m processing a number of emotions as I get up to speed on the implications of these security flaws.  Firstly, frustration – any new security weakness causes more work for me as I test our systems, evaluate our vulnerabilities, apply necessary patches, and communicate with clients and colleagues.

Secondly, anger.  I’m angry that vendors do not feel that they are bound by any particular obligation to provide their clients with the most secure systems possible, and that in both these cases they have deliberately violated protocols that have been developed over many years specifically to protect personal data from hackers, thieves, spies, corporate espionage, and other malicious actors.  I don’t know whether their underlying motivation was greed, malice, or simply stupidity, but whatever the cause, I’m deeply, deeply disappointed.  Not just with the companies, but with the specific individuals who chose to create flawed certificates, who chose to install them, who chose to bypass the very systems that we trust to keep us safe, and who chose to lie to consumers about it; telling them that this was ‘value added’ software, designed to ‘enhance their browsing experience’.

Thirdly, though, I’m grateful. We wouldn’t have even known about these flaws without the sterling work of security researchers such as Filippo Valsorda. Watching his Twitter stream as the Superfish scandal unfolded was a surreal experience. As far as I can tell, the man neither eats nor sleeps; he just effortlessly creates software, documentation, vulnerability testing code, and informative tweets, with a speed that leaves me not so much envious as awestruck.

And finally, I’m left with a sense of determination.  The whole world is connected now, and the Internet is every bit as critical to our global infrastructure as roads, shipping lanes, corporations, and governments. And it is a vital shared resource.  If it is to continue to flourish, continue to allow us to communicate, learn, conduct business, share and collaborate, then it must remain a robust, trustable system.  And although we have been sadly let down this week by systems vendors, the Internet is bigger than any one company.  And our collective need and motivation for it to be a trustable system is greater than the shortsighted greed of any number of individuals.

So I’ll go back to work tomorrow, and I’ll do my best to keep my clients’ data secure, their systems running, and their information flowing, and I’ll do so grateful for all the work of millions of other hard-working developers, systems administrators, hardware designers, and other assorted geeks.


Here’s to the crazy ones.


Bleeding Heartbeats

So, like systems administrators across the planet, I spent the day making sure that the various servers that I’m responsible for are not vulnerable to the “Heartbleed” bug.   Now that it’s all over, I’m still quite shaken by the severity of this issue and its long term implications for the security of the internet.

There are already a number of good technical explanations of this bug, such as this one.  The flaw can be boiled down to the following:

  • Many webservers use something called OpenSSL to encrypt communication between the browser and the website.
  • Recent versions of OpenSSL include a feature called heartbeat, which allows an end user to send a packet of data and the server sends it straight back.
  • However, if you lie to the server, it can get confused.  If you tell it you’re sending a packet of, say, 1000 bytes, but don’t actually send any data, it will still send you back 1000 bytes.  And those bytes will contain all sorts of random stuff that happens to be lying around in memory on the server.
  • Stuff such as user names, session ids, and query strings.
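The bug is easy to sketch. The following is not OpenSSL’s actual code, just a toy simulation (in Python, for clarity, with made-up memory contents) of a server that trusts the length the client claims to have sent:

```python
# Toy simulation of the Heartbleed over-read: the "server" trusts the
# length claimed by the client instead of the payload it actually received.
server_memory = b"GET /login user=alice;sessionid=8f3a9c secret=hunter2 ..."

def heartbeat(payload: bytes, claimed_length: int) -> bytes:
    # A correct server would verify claimed_length == len(payload).
    # The buggy one copies claimed_length bytes starting at the payload,
    # running past its end into whatever else is sitting in memory.
    buffer = payload + server_memory
    return buffer[:claimed_length]

# Honest request: send 3 bytes, say they're 3 bytes, get 3 bytes back.
print(heartbeat(b"hi!", 3))   # b'hi!'

# Malicious request: send 0 bytes but claim 40 - the reply leaks memory.
print(heartbeat(b"", 40))
```

The second call returns 40 bytes of whatever happened to be adjacent in memory, which is exactly the class of leak described above.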

It was a very, very sobering experience to query a vulnerable webserver and see it return information from another user’s session that would be sufficient for me to impersonate them.  This is not supposed to happen.

Really, the only good thing about this whole episode was watching how fast the sysadmin community reacted.    There were excellent, in-depth technical discussions on Reddit, Hacker News and elsewhere.  Several people wrote and published vulnerability testing software in a matter of hours, such as http://filippo.io/Heartbleed/ from Filippo Valsorda.  It was inspiring to watch him build, refine, publish and support an incredibly useful tool overnight, and then scale it out using Amazon Web Services to support the sudden huge demand for the tool.

The longer-term implications of all this are unclear.  We’ve now had several worrying encryption bugs in the space of a few weeks, including Apple’s embarrassing ‘goto fail’.

Encryption is the bedrock of the modern internet, and yet we’re learning time and time again that writing good encryption software is hard.

Part of this must have to do with the tools that we use.  C is an excellent language for systems level development, but was never designed to be used in a hostile environment.  So many security breaches over the years have been caused by the way C requires the programmer to manually manage memory space.  The promise of Open Source Software is that with many eyes checking source code, bugs will be found quickly.  But the OpenSSL source code is, by all accounts, a mess.  Just have a look at the actual lines of code responsible for the heartbleed error and ask yourself whether you would have found the vulnerability while performing a security review.

I’m not sure what all the answers are, but I have a strong feeling that a key response is to start writing more readable software.


Fundamentally, we need a more secure web.  The internet is central to our lives, our societies, and our economies.  Today I saw what a shaky foundation it all rests on.


‘top’ and friends – what’s going on on my system?

Linux gives you all sorts of ways of keeping an eye on what’s going on on your server.  Here are several that I find useful.  Read my post on apt-get to learn how to install any of these that may not already be on your computer.

The grand-daddy of them all is top

This is like Task Manager on Windows or Activity Monitor on Mac.  It gives you a quick list of all the processes that are running on your computer, how much RAM and processor power they’re using, and how long they’ve been running for.
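As an aside, top can also run non-interactively, which is handy for capturing a snapshot from a script or cron job (this assumes the standard procps version of top):

```shell
# -b: batch mode (plain-text output, suitable for piping)
# -n 1: run one iteration, then exit
top -b -n 1 | head -n 15
```
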

top, however, has a pretty ugly output format, so it’s good that there are several better alternatives.

My preferred tool is htop, which gives a better visual display, including a nice bar-chart showing CPU usage on all your cores, and the ability to sort by CPU, memory usage etc.


  CPU[||                                                                          1.3%]     Tasks: 44, 46 thr; 1 running
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||  332/987MB]     Load average: 0.00 0.01 0.05 
  Swp[|                                                                       0/1021MB]     Uptime: 1 day, 13:00:40

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
24133 trevor     20   0 26420  2756  1456 R  0.7  0.3  0:00.57 htop
 1030 root       20   0 91728  4412  3572 S  0.0  0.4  2:21.86 /usr/sbin/vmtoolsd
 1051 postgres   20   0  129M  5056  3628 S  0.0  0.5  0:24.74 postgres: writer process

So what’s making my disk thrash?

htop tells us what processes are using memory and processing power. To see what’s using I/O, use iotop. This needs to be run as root:


$ sudo apt-get install iotop
$ sudo iotop

which gives you a list of which processes are reading and writing to disk.

But what’s using my network connection?

Likewise, we can monitor network usage using iftop


$ sudo apt-get install iftop
$ sudo iftop

which shows the open connections on your machine and how much data is being passed through them.

An alternative to iftop is nethogs which shows you which processes are responsible for data being passed over your network connections.
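nethogs installs and runs the same way as the others (the package name here is assumed from the Debian/Ubuntu repositories):

```shell
$ sudo apt-get install nethogs
$ sudo nethogs
```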

I also find netstat useful in these situations. netstat can do a lot, but one invocation I find myself using a lot is

 $ netstat -plnt 

which means ‘show me all the processes on this machine that are listening for TCP/IP connections, and which port they are listening on’. This is very helpful when you’re setting up a machine and you want to verify that, say, your webserver or database server is running correctly.

Linux gives you many, many ways to see what’s going on on your system, and we’ve only scratched the surface here. Check out the rest of my technology toolkit posts.


Four Killer Postgres Extensions

I’ve been using the Postgres database engine for probably 10 years now, and despite having also used Oracle, DB2, Microsoft SQL Server, SQLite, MySQL, Access, FoxPro and others, it’s still by far and away my favourite.

In all my years of using it, I have never once encountered, or even heard of, an incident of data loss or integrity failure.  It implements the SQL standard rigorously. And whatever I throw at it, it seems to just keep humming along.

And more than that, it’s extensible.  Here are four extensions that take it from merely being the most solid, reliable relational database engine in existence, to also being an incredible modern application development platform.  And they’re easy to install and get started with.

UUID

Install this by simply typing the following in psql:

# CREATE EXTENSION "uuid-ossp";

Since discovering this, I’ve said ‘goodbye’ to SERIAL primary keys forever.  Now, my table definitions frequently look like this:

CREATE TABLE thing(
    uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    ...
);

And there you go, items of type ‘thing’ will now be uniquely identifiable GLOBALLY, as well as within this table.  I can back up and restore the database without worrying about sequence values being regenerated correctly, I can merge data from two sources without worrying about key conflicts, and I can use that UUID in other code, or in RESTful URLs, and always be sure what I’m referring to.
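For example, assuming the table’s other columns all have defaults, inserting a row hands you back a freshly generated key (uuid_generate_v4() produces random, version-4 UUIDs, so the value below is just illustrative):

```sql
INSERT INTO thing DEFAULT VALUES RETURNING uuid;
-- returns something like: 0c68ed74-2f1b-4fd1-8a0f-9d3ce2a7b841
```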

pgcrypto

Once again, this can be installed as follows:

# CREATE EXTENSION pgcrypto;

and then you can finally get your password management right, free of all the embarrassing security errors that have plagued some popular sites in recent years.

I recommend the following structure for a username/password table:

CREATE TABLE app_user(
    uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- just as before!
    email text not null,
    hashed_password text not null,
    UNIQUE (email)
);

And then populate the hashed_password column by using the handy crypt function from pgcrypto

INSERT INTO app_user (email, hashed_password) VALUES (:email, crypt(:password, gen_salt('bf'))) RETURNING uuid;

Then you can check user credentials like this:

SELECT EXISTS (SELECT uuid FROM app_user WHERE email=:email AND hashed_password=crypt(:password,hashed_password));

This has any number of security advantages, and avoids many common pitfalls:

  • The algorithm has a tunable speed.  By choosing a salt of type 'bf', the algorithm will be many thousands of times more resistant to brute-force attacks than one based on SHA-1, and many thousands of times more resistant again than one based on MD5 hashing.  See here for more in-depth info.
  • By creating a new salt for each user, and embedding it in the output hash, the same username/password combination will not result in the same output hash. So even if a malicious attacker had access to this table, they wouldn’t be able to perform hash lookups in rainbow tables.  It’s frightening that many sites still don’t use this approach.
  • It’s been very thoroughly tested.  Although I’ve written plenty of crypto code before, I’d always rather use a widely-tested, discussed and understood implementation.  There are so many mistakes that are easy to make and hard to detect in the security world that using a solid open-source library like pgcrypto is just good practice.

Hstore

# CREATE EXTENSION hstore;

Hstore allows you to store arbitrary key/value pairs in a database column. This is perfect for storing property bags, and for situations where you don’t know at design time exactly what the structure of your data is going to be.

Let’s extend our user table:

CREATE TABLE app_user(
    uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- just as before!
    email text not null,
    hashed_password text not null,
    properties hstore,
    UNIQUE (email)
);

Now we can assign arbitrary properties to each user:

UPDATE app_user SET properties = 'twitter_id=>sir_tweetsalot, comments=>"some notes here", follow_up=>1'::hstore;

and then we can query against that hstore field, using the ? operator to test whether a key is present:

SELECT * FROM app_user WHERE app_user.properties ? 'follow_up';

In fact, with UUID and hstore, Postgres is already looking like a pretty good NoSQL solution, but still with all the traditional SQL benefits of transactional integrity.


PLV8

And then finally, plv8.  I’m only beginning to discover how powerful this is, and it really deserves a post of its own.  In brief, PLV8 allows you to write stored procedures in JavaScript, CoffeeScript or LiveScript.
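As a taste, here’s a minimal sketch, assuming the plv8 extension is available on your server: a classic toy function, written in JavaScript but callable from plain SQL.

```sql
# CREATE EXTENSION plv8;

-- A stored procedure written in JavaScript, executed by V8 inside Postgres
CREATE OR REPLACE FUNCTION fib(n int) RETURNS int AS $$
  var a = 0, b = 1, t;
  for (var i = 0; i < n; i++) { t = a + b; a = b; b = t; }
  return a;
$$ LANGUAGE plv8;

SELECT fib(10);  -- 55
```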

There are all sorts of things that you could do with this.  Suffice it to say, last month we were pretty proud of ourselves when we wrote our own dialect of LISP, wrote a parser for it in CoffeeScript, and then got the whole thing running inside our Postgres database.  Yeah, we were using Lisp on top of CoffeeScript to filter SQL records.  That’s how we do things around here!

And all this is before Postgres formally gets JSON as a standard data type.  I can’t wait!



New site design

I’ve tried to drag this site kicking and screaming into, well, whatever we’re calling this current decade.    The goal is a clean, minimalist interface, easy to find articles, and a good user experience on any device.

I’ve also just learned far more about the innards of WordPress than I ever wanted to.  Apparently, if you create a child theme, you override existing PHP pages by creating new pages with the same name in your child theme folder.  Unless, of course, the page is functions.php.  In that case, both copies get included, causing all sorts of fun conflicts.

In the end I gave up and edited the parent theme as well, so if I ever update the parent theme I’ll have to make a couple of changes.

Thank goodness for the Chrome WebInspector.  I can’t imagine how we used to do web development before we had tools like that.




Nexus 7

I’m loving my new Nexus 7.  I’m sitting in Casa Cappuccino drinking coffee and updating my blog.  It’s a significant improvement over my old Playbook, for a number of reasons.

Firstly, the on-screen keyboard is better.  A physical keyboard is always going to be more efficient than a touchscreen, but I’m finding that typing seems a lot more fluid on this than on the PlayBook.  I’m making fewer errors, the autocorrect is smarter, and editing existing text is easier. I’m not convinced by gesture typing yet, but with a bit of practice it could be quite handy.

Second, being part of the Android ecosystem means there are way more apps available than on the PlayBook.  The Nexus 7 is a very nice portable movie viewer, for example.

Third, it’s very accessible as a development platform.  It didn’t take long for me to go through the Android tutorials and write, build and deploy my first application to the device.

On the downside, I miss the smooth integration with iTunes that I’m used to on iOS devices, and I’m frustrated that Greek fonts don’t render correctly.  I love the YouVersion Bible application, and I wish I could switch easily between the English text and the Greek.

That said, this is definitely the nicest mobile device I’ve used so far, and is significantly cheaper than the iPad mini.  Strongly recommended.


Online Development Tools

A couple of really neat tools caught my eye this week.

Try F# – F# is a functional language built on top of Microsoft’s CLR, drawing on languages like Haskell and OCaml.  Every language should have a site like this; it’s a great way to explore the syntax and start getting your head wrapped around the design philosophy.

SQLFiddle – A brilliant little tool that allows you to create a SQL database schema and execute queries against it directly in your browser.  It supports Microsoft SQL Server, MySQL, Postgres and Oracle.  It even gives you a nice query planner output to help you understand the cost of your query.  If you choose Postgres, it provides you with a direct link to http://explain.depesz.com, a neat tool for visualising the output of the Postgres EXPLAIN command.

Postgres provides some very powerful mechanisms for profiling and optimizing your queries, but they can be a little arcane to get started with.  These tools could really help speed up your workflow.

SQLFiddle is clearly inspired by JSFiddle, a tool I’ve used for the last year or so for trying out short snippets of JavaScript and sharing them with others.  It automatically gives you access to popular JavaScript libraries such as jQuery, and allows you to quickly prototype HTML, CSS, JavaScript and even CoffeeScript with zero setup costs.



Gimp Paint Studio

How did I not know about Gimp Paint Studio before today?  This is a stunning collection of brushes, textures and most importantly, tool presets.

This takes GIMP from merely being an incredibly powerful image editor to being an incredibly powerful painting tool, as well.  I’ve played around with it for an hour, and I’m loving it already.