Privacy and Duty

The last few months have been a tough time for internet security.

We learned just last week that a major computer manufacturer was shipping laptops with malware and fake root certificates, compromising the secure communications of its customers.

We learned that hackers stole hundreds of millions of dollars from Russian banks.
And we learned that intelligence agencies may have hacked into a major SIM card manufacturer, putting the privacy of millions of people at risk.
Those of us in the IT world have a duty to respond to these incidents.
And I use the word duty very intentionally.  Most system administrators have, by nature of their work, a moral, ethical, contractual and legal obligation to protect client and company data.

For example, if they work for a law firm, then the Canadian Bar Association Code of Professional Conduct includes this section:

Maintaining Information in Confidence
1. The lawyer has a duty to hold in strict confidence all information concerning the business and affairs of the client acquired in the course of the professional relationship, and shall not divulge any such information except as expressly or impliedly authorized by the client, required by law or otherwise required by this Code.

 

The duty to ‘hold information in strict confidence’ must apply every bit as much to electronic records and communications as to any other type of information.

If you work for a company with a presence in Europe, you are bound by EU data legislation, which includes:

“Everyone has the right to the protection of personal data.”
Under EU law, personal data can only be gathered legally under strict conditions, for a legitimate purpose. Furthermore, persons or organisations which collect and manage your personal information must protect it from misuse and must respect certain rights of the data owners which are guaranteed by EU law.

 

In my career, I’ve often found myself working with health care data, and thus coming under the jurisdiction of Ontario’s Personal Health Information Protection Act, which among other things states:

12.  (1)  A health information custodian shall take steps that are reasonable in the circumstances to ensure that personal health information in the custodian’s custody or control is protected against theft, loss and unauthorized use or disclosure and to ensure that the records containing the information are protected against unauthorized copying, modification or disposal.

And anyone working in the financial industry is likely to find themselves subject to a Code of Ethics such as this one from TD Bank:

A. Protecting Customer Information
“Customer information must be kept private and confidential.”

C. Protecting TD Information
“We must carefully protect the confidential and proprietary information to which we have access, and not disclose it to anyone outside of TD or use it without proper authorization, and then only for the proper performance of our duties.”

Nothing to Hide?

Occasionally, I’ve heard the suggestion that ‘those with nothing to hide have nothing to fear.’
In the light of these duties and obligations, this claim is, of course, absurd.  Not only do we in the IT industry have access to, and responsibility for, large amounts of confidential information, we have a moral, ethical, contractual and legal obligation to keep it secure – to ‘hide’ it.
Because we can’t divine intent when our systems come under attack.  Whether it’s a criminal gang, a careless vendor, or a foreign intelligence agency, the attack vectors are the same, and our response must be the same: robustly and diligently protecting the systems and data that have been placed in our care.

A Rough Week for Security

2014 was a tough year for anyone responsible for systems security.  Heartbleed was uncovered in April, which led to some seriously panicky moments as we realised that supposedly secure webservers had been quietly leaking private information from memory.  Then later in the year came the Shellshock vulnerability in many Unix systems, leading to yet more sleepless nights as I and countless other systems administrators rushed to patch our systems.

I did find a couple of silver linings in these events, though. Firstly, both of the vulnerabilities, although severe, were the result of genuine mistakes on the part of well-meaning, under-resourced developers who didn’t anticipate the consequences of some of their design decisions.  And secondly, I was intensely proud of how quickly the open source community rallied to provide diagnostic tools, patches, tests, and guides.  With a speed and efficiency that I’ve never seen in a large company, a bunch of unpaid volunteers provided the tools we needed to dig ourselves out of the mess.

2015, however, is so far going worse.  This week’s security flaws, specifically the ‘Superfish’ scandal (in which Lenovo deliberately sold laptops with a compromised root certificate purely so that third-party software could inject ads into supposedly secure websites, exposing millions of users to potential man-in-the-middle attacks) and the now-brewing ‘PrivDog’ scandal (trust me, you’ll hear about this soon if you follow security blogs…), are the direct result of vendors choosing to violate the trust of consumers in the interests of chasing tiny increases in their profit margins.

I’m processing a number of emotions as I get up to speed on the implications of these security flaws.  Firstly, frustration – any new security weakness causes more work for me as I test our systems, evaluate our vulnerabilities, apply necessary patches, and communicate with clients and colleagues.

Secondly, anger.  I’m angry that vendors do not feel that they are bound by any particular obligation to provide their clients with the most secure systems possible, and that in both these cases they have deliberately violated protocols that have been developed over many years specifically to protect personal data from hackers, thieves, spies, corporate espionage, and other malicious actors.  I don’t know whether their underlying motivation was greed, malice, or simply stupidity, but whatever the cause, I’m deeply, deeply disappointed.  Not just with the companies, but with the specific individuals who chose to create flawed certificates, who chose to install them, who chose to bypass the very systems that we trust to keep us safe, and who chose to lie to consumers about it; telling them that this was ‘value added’ software, designed to ‘enhance their browsing experience’.

Thirdly, though, I’m grateful.  We wouldn’t have even known about these flaws without the sterling work of security researchers such as Filippo Valsorda.  Watching his Twitter stream as the Superfish scandal unfolded was a surreal experience.  As far as I can tell, the man neither eats nor sleeps; he just effortlessly creates software, documentation, vulnerability testing code, and informative tweets, with a speed that leaves me not so much envious as awestruck.

And finally, I’m left with a sense of determination.  The whole world is connected now, and the Internet is every bit as critical to our global infrastructure as roads, shipping lanes, corporations, and governments. And it is a vital shared resource.  If it is to continue to flourish, continue to allow us to communicate, learn, conduct business, share and collaborate, then it must remain a robust, trustable system.  And although we have been sadly let down this week by systems vendors, the Internet is bigger than any one company.  And our collective need and motivation for it to be a trustable system is greater than the shortsighted greed of any number of individuals.

So I’ll go back to work tomorrow, and I’ll do my best to keep my clients’ data secure, their systems running, their information flowing, and I’ll do so grateful for all the work of millions of other hard-working developers, systems administrators, hardware designers, and other assorted geeks.

 

Here’s to the crazy ones.


Bleeding Heartbeats

So, like systems administrators across the planet, I spent the day making sure that the various servers that I’m responsible for are not vulnerable to the “Heartbleed” bug.   Now that it’s all over, I’m still quite shaken by the severity of this issue and its long term implications for the security of the internet.

There are already a number of good technical explanations of this bug, such as this one.  The flaw can be boiled down to the following:

  • Many webservers use something called OpenSSL to encrypt communication between the browser and the website.
  • Recent versions of OpenSSL include a feature called heartbeat, which lets a client send a small packet of data to the server and have the same data sent straight back – a simple way of checking that the connection is still alive.
  • However, if you lie to the server, it gets confused.  If you tell it you’re sending a packet of, say, 1000 bytes, but actually send far less, it will still send you back 1000 bytes of data.  And that data will be padded out with whatever happens to be lying around in memory on the server.
  • Stuff such as user names, session ids, and query strings.

It was a very, very sobering experience to query a vulnerable webserver and see it return information from another user’s session that would be sufficient for me to impersonate them.  This is not supposed to happen.

Really, the only good thing about this whole episode was watching how fast the sysadmin community reacted.    There were excellent, in-depth technical discussions on Reddit, Hacker News and elsewhere.  Several people wrote and published vulnerability testing software in a matter of hours, such as http://filippo.io/Heartbleed/ from Filippo Valsorda.  It was inspiring to watch him build, refine, publish and support an incredibly useful tool overnight, and then scale it out using Amazon Web Services to support the sudden huge demand for the tool.
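
If you’d rather test from your own shell, sufficiently recent versions of nmap also ship an ssl-heartbleed script. A rough sketch, with a placeholder hostname – check your nmap version, and only ever scan servers you are responsible for:

$ sudo apt-get install nmap
$ nmap -p 443 --script ssl-heartbleed myserver.example.com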

The longer term implications of all this are unclear.  We’ve now had several worrying encryption bugs in the space of a few weeks, including Apple’s embarrassing ‘goto fail’.

Encryption is the bedrock of the modern internet, and yet we’re learning time and time again that writing good encryption software is hard.

Part of this must have to do with the tools that we use.  C is an excellent language for systems level development, but was never designed to be used in a hostile environment.  So many security breaches over the years have been caused by the way C requires the programmer to manually manage memory space.  The promise of Open Source Software is that with many eyes checking source code, bugs will be found quickly.  But the OpenSSL source code is, by all accounts, a mess.  Just have a look at the actual lines of code responsible for the heartbleed error and ask yourself whether you would have found the vulnerability while performing a security review.

I’m not sure what all the answers are, but I have a strong feeling that a key response is to start writing more readable software.

 

Fundamentally, we need a more secure web.  The internet is central to our lives, our societies, and our economies.  Today I saw what a shaky foundation it all rests on.


What’s in my files? Heads, tails, cats and more.

As soon as you start working on the Linux command line, you have to start working with files.  Linux follows a very powerful design philosophy expressed as everything is a file. This can take some getting used to, but is incredibly useful once you get it.  Because once you’ve learned how to read and manipulate text files, you can do pretty much anything on your machine.

The first command you need to know is cat. Cat is short for ‘concatenate’, and is used for reading files and writing their contents back out – to the screen, or into other files. So if I have a file in my current working directory, I can get its contents with cat:


$ echo "hello world" > myfile.txt
$ cat myfile.txt 
hello world

The first line created the file myfile.txt and directed the string “hello world” into it. The second line used cat to read the contents of the file.
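
True to its name, cat can also concatenate several files, and with >> you can append to a file rather than overwrite it. A quick sketch, building on the file we just made:

$ echo "second line" >> myfile.txt        # append rather than overwrite
$ cat myfile.txt myfile.txt > twice.txt   # concatenate two inputs into a new file
$ cat twice.txt
hello world
second line
hello world
second line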

Here’s a more real-world example:

$ cat /var/log/syslog

This prints out all the lines in the syslog file, present on most unix systems. Don’t worry too much about the contents of the file; it’s probably mostly log records from your system booting up. Now, if you’re inspecting a log file, very often you only want the last few lines, rather than the entire thing. Linux provides a useful command for this very purpose:

$ tail /var/log/syslog

Tail, as you might guess, prints out the ‘tail’, or last few lines, of a file. Its companion is head, which prints the first few lines:

$ head /var/log/syslog
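
Both head and tail take a -n flag to control how many lines you get, and tail -f will ‘follow’ a file, printing new lines as they are appended – handy for watching a log in real time:

$ tail -n 50 /var/log/syslog   # the last 50 lines
$ head -n 5 /var/log/syslog    # just the first 5 lines
$ tail -f /var/log/syslog      # keep printing new entries; Ctrl-C to stop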

If you want the ability to view large amounts of text and scroll through it, then the command you want is less.

$ less /var/log/syslog

This allows you to scroll up and down through your file. Press q to return to the command line.
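
A few less keystrokes are worth learning: / searches forward for a pattern, G jumps to the end of the file, and starting less with +F makes it behave like tail -f until you press Ctrl-C:

$ less +F /var/log/syslog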

Here’s one last neat little trick. I mentioned that on Linux, everything is a file. Even the state of the system is represented as a file.


trevor@vm2:~$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 70
model name	: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
stepping	: 1
microcode	: 0xf
cpu MHz		: 2294.218
cache size	: 6144 KB
...

Rather than having many different tools for different tasks, a Linux system is more like a box of Lego. The same tools can be plugged into all sorts of different places to enable you to do what you want. I use the humble cat command on a daily basis to modify system settings, read files, and even monitor the health of my machines.
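
As a small, hedged example of that last point (the exact settings available under /proc/sys vary from system to system, so treat this as illustrative), here is how I might check and then change one kernel setting:

$ cat /proc/sys/net/ipv4/ip_forward
0
$ echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward   # enable IP forwarding until the next reboot
1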



apt-get: Making your ubuntu machine more better

I pretty much live on the Linux command line.  There are a number of tools in my ‘toolkit’ that I use on a daily basis to automate tasks, manage systems and provide features.

apt-get is one of the most important tools to understand if you’re using an Ubuntu distribution.

apt-get is like the Apple App Store, except it’s been around for much longer and everything it provides is absolutely free. When I’m setting up a new Linux machine I immediately download and install several useful packages.


$ sudo apt-get install ipython
$ sudo apt-get install nmap
$ sudo apt-get install mercurial

and so on.  Very quickly your new Linux machine can be a database server, a graphic design workstation, or a development engine.
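
You can also hand apt-get several package names in one go, and apt-cache search will help you find a package when you only know roughly what you’re looking for:

$ sudo apt-get install ipython nmap mercurial
$ apt-cache search port scanner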

Several apt-get commands are worth knowing:


$ sudo apt-get install nmap, for example, installs a powerful port-scanner.

$ sudo apt-get remove nmap removes the package we just installed.

$ sudo apt-get upgrade upgrades all installed packages to the latest version.
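
One detail worth remembering: apt-get works from a locally cached list of available packages, so refresh that list before upgrading:

$ sudo apt-get update    # refresh the package lists
$ sudo apt-get upgrade   # then upgrade everything that has a newer version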

Even more powerfully,

$ sudo do-release-upgrade upgrades your Ubuntu system when a new version is released.   To find out what version of Ubuntu you’re running, I suggest $ lsb_release -a. On my machine it gives the following output:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.10
Release: 13.10
Codename: saucy

Check http://www.ubuntu.com/ to see what the latest release version is.

Have fun!

 

 



‘top’ and friends – what’s going on on my system?

Linux gives you all sorts of ways of keeping an eye on what’s going on on your server.  Here are several that I find useful.  Read my post on apt-get to get any of these that may not be installed on your computer.

The grand-daddy of them all is top.

This is like Task Manager on Windows or Activity Monitor on Mac.  It gives you a quick list of all the processes that are running on your computer, how much RAM and processor power they’re using, and how long they’ve been running for.

top, however, has a pretty ugly output format, so it’s good that there are several better alternatives.

My preferred tool is htop, which gives a better visual display, including a nice bar chart showing CPU usage on all your cores, and the ability to sort by CPU, memory usage and so on.


  CPU[||                                                                          1.3%]     Tasks: 44, 46 thr; 1 running
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||  332/987MB]     Load average: 0.00 0.01 0.05 
  Swp[|                                                                       0/1021MB]     Uptime: 1 day, 13:00:40

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
24133 trevor     20   0 26420  2756  1456 R  0.7  0.3  0:00.57 htop
 1030 root       20   0 91728  4412  3572 S  0.0  0.4  2:21.86 /usr/sbin/vmtoolsd
 1051 postgres   20   0  129M  5056  3628 S  0.0  0.5  0:24.74 postgres: writer process
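
htop isn’t always installed by default, but it’s only an apt-get away:

$ sudo apt-get install htop
$ htop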

So what’s making my disk thrash?

htop tells us what processes are using memory and processing power. To see what’s using I/O, use iotop. This needs to be run as root:


$ sudo apt-get install iotop
$ sudo iotop

which gives you a list of which processes are reading and writing to disk.
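
By default iotop lists every process; the -o (or --only) flag narrows the display to processes actually doing I/O right now, which is usually what you want when the disk is thrashing:

$ sudo iotop -o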

But what’s using my network connection?

Likewise, we can monitor network usage using iftop:


$ sudo apt-get install iftop
$ sudo iftop

which shows the open connections on your machine and how much data is being passed through them.

An alternative to iftop is nethogs which shows you which processes are responsible for data being passed over your network connections.
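
nethogs installs and runs in much the same way – you may need to tell it which network interface to watch (eth0 in this sketch; yours may be named differently):

$ sudo apt-get install nethogs
$ sudo nethogs eth0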

I also find netstat useful in these situations.  netstat can do a lot, but the invocation I reach for most often is

 $ netstat -plnt 

which means ‘show me all the processes on this machine that are listening for TCP/IP connections, and which port they are listening on’. This is very helpful when you’re setting up a machine and you want to verify that, say, your webserver or database server is running correctly.
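
One caveat: newer distributions sometimes ship without netstat. Its replacement, ss from the iproute2 package, accepts the same flags for this particular job:

$ sudo ss -plnt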

Linux gives you many, many ways to see what’s going on on your system, and we’ve only scratched the surface here. Check out the rest of my technology toolkit posts.


What are my servers doing? ping, mtr and nmap

If you’re a network administrator or web developer of any kind, you are often going to want to know what your machines are doing.  Is your database machine connected to the network?  Is your firewall working?  Is your web server properly configured?

ping is the very first tool that we reach for.

  ping simply sends a message to another machine to see if it is responding.

$ ping google.com

When you run this, you should start seeing responses from google.com getting printed out at your command line. If you don’t get this, then your network connection is probably down. This can also be used to see if machines that you own are up and running.

$ ping mywebserver
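
ping will keep going until you stop it with Ctrl-C; on Linux, the -c flag sends a fixed number of packets and then prints a summary:

$ ping -c 4 mywebserver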

If ping doesn’t work, and you’re sure that you are connected to the network, the next thing to do is to make sure that you have the right machine name and that your client machine can look up its IP address correctly. That’s where dig comes in.

$ dig mycoolsite.ca 

Check that this gives you the right IP address. Be aware that some DNS providers such as OpenDNS may give you back an IP address even for sites that don’t exist, but it won’t be anything useful.
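
If you just want the address and nothing else, dig +short trims the output down to the bare answer:

$ dig +short mycoolsite.ca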

mtr is like ping on steroids – it will show you all the hops that your network packets are taking between you and your target machine, and how responsive each intermediate machine is being. This is a very handy tool for analysing connectivity problems.
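
mtr runs as an interactive display by default; the --report flag sends a fixed number of probes and prints a summary that you can paste into an email or a ticket:

$ mtr --report google.com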

My server is up, but what is it doing?

nmap is a vital tool in every network administrator’s toolkit. It deserves an article all to itself, but here’s a quick taste:

trevor@vm2:~$ nmap localhost 

Starting Nmap 6.40 ( http://nmap.org ) at 2013-11-28 18:00 PST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000094s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
5432/tcp open  postgresql

Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds

This tells me that machine vm2 is running ssh, a webserver, and the postgresql database engine. If this machine was visible to the outside world, now would be the time when I would make sure that postgresql wasn’t accepting non-local connections.
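
A quick way to check that from the shell is to see exactly which address postgresql is bound to – 127.0.0.1 means it is only accepting local connections, while 0.0.0.0 means it is listening to the whole world:

$ sudo netstat -plnt | grep 5432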

As with all these tools, nmap can be installed via

$ sudo apt-get install nmap

Find out how to install packages, keep an eye on your Linux system, and pick up other useful tips in my technology toolkit posts.


Four Killer Postgres Extensions

I’ve been using the Postgres database engine for probably 10 years now, and despite having also used Oracle, DB2, Microsoft SQL Server, SQLite, MySQL, Access, FoxPro and others, it’s still far and away my favourite.

In all my years of using it, I have never once encountered, or even heard of, an incident of data loss or integrity failure.  It implements the SQL standard rigorously. And whatever I throw at it, it seems to just keep humming along.

And more than that, it’s extensible.  Here are four extensions that take it from merely being the most solid, reliable relational database engine in existence, to also being an incredible modern application development platform.  And they’re easy to install and get started with.

UUID

Install this by simply typing, in psql:

# CREATE EXTENSION "uuid-ossp";

Since discovering this, I’ve said ‘goodbye’ to SERIAL primary keys forever.  Now, my table definitions frequently look like this:

CREATE TABLE thing(
uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
...
)

And there you go: items of type ‘thing’ will now be uniquely identifiable GLOBALLY, as well as within this table.  I can back up and restore the database without worrying about getting sequence values generated correctly, I can merge data from two sources without worrying about conflicts, and I can use that UUID in other code, or in RESTful URLs, and always be sure what I’m referring to.
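
As a quick illustration (the UUID below is made up), a lookup keyed on that value always refers to exactly one row, no matter where the data came from – and the same string can safely appear in application code or in a URL like /things/a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11:

SELECT * FROM thing WHERE uuid = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';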

pgcrypto

Once again, this can be installed as follows:

# CREATE EXTENSION pgcrypto;

and then you can finally get your password management right, and free of all the embarrassing security errors that have plagued some popular sites in recent years.

I recommend the following structure for a username/password table:

CREATE TABLE app_user(
uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- just as before!
email text not null,
hashed_password text not null ,
UNIQUE (email)
)

And then populate the hashed_password column by using the handy crypt function from pgcrypto:

INSERT INTO app_user (email, hashed_password) VALUES (:email, crypt(:password, gen_salt('bf'))) RETURNING uuid;

Then you can check user credentials like this:

SELECT EXISTS (SELECT uuid FROM app_user WHERE email=:email AND hashed_password=crypt(:password,hashed_password));

This has any number of security advantages, and avoids many common pitfalls:

  • The algorithm has a tunable speed (see the example after this list).  By choosing a salt of type ‘bf’, the hash will be many thousands of times more resistant to brute-force attacks than one based on SHA1, and many, many thousands of times more resistant than one based on MD5 hashing.  See here for more in-depth info.
  • By creating a new salt for each user, and embedding it in the output hash, the same username/password combination will not result in the same output hash. So even if a malicious attacker had access to this table,  he wouldn’t be able to perform hash lookups in rainbow tables.  It’s frightening that many sites don’t use this approach yet.
  • It’s been very thoroughly tested.  Although I’ve written plenty of crypto code before, I’d always rather use a widely-tested, discussed and understood implementation.  There are so many mistakes that are easy to make and hard to detect in the security world that using a solid open-source library like pgcrypto is just good practice.
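
On that first point, gen_salt takes an optional second argument: the Blowfish work factor, which you can raise as hardware gets faster. A sketch (bear in mind that higher values also slow down every legitimate login check):

-- each increment roughly doubles the work needed to test a candidate password
SELECT crypt('correct horse battery staple', gen_salt('bf', 10));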

Hstore

# CREATE EXTENSION hstore;

Hstore allows you to store arbitrary key/value pairs in a database column. This is perfect for storing property bags, and for situations where you don’t know at design time exactly what the structure of your data is going to be.

Let’s extend our user table:

CREATE TABLE app_user(
uuid UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- just as before!
email text not null,
hashed_password text not null,
properties hstore,
UNIQUE (email)
)

Now we can assign arbitrary properties to each user:

UPDATE app_user SET properties = 'twitter_id => sir_tweetsalot, comments => "some notes here", follow_up => 1'::hstore;

and then we can use that hstore field as follows:

SELECT * FROM app_user WHERE app_user.properties->'follow_up' = '1';

In fact, with UUID and HStore, Postgres is already looking like a pretty good NoSQL solution, but still with all the traditional SQL benefits of transactional integrity.
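
If you go down this road with large tables, it’s worth knowing that hstore columns can be indexed. A GIN index (sketched below) keeps key-existence and containment queries fast:

CREATE INDEX app_user_properties_idx ON app_user USING GIN (properties);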

 

PLV8

And then finally, plv8.  I’m only beginning to discover how powerful this is, and it really deserves a post of its own.  In brief, PLV8 allows you to write stored procedures in Javascript,  Coffeescript or LiveScript.
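
As a minimal taste – a sketch that assumes the extension has already been installed with CREATE EXTENSION plv8 – here is a stored function written in Javascript:

CREATE FUNCTION count_words(body text) RETURNS integer AS $$
  // split on whitespace, ignoring empty fragments, and count the pieces
  return body.trim().split(/\s+/).filter(function (w) { return w.length > 0; }).length;
$$ LANGUAGE plv8;

SELECT count_words('the quick brown fox');  -- returns 4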

There are all sorts of things that you could do with this.  Suffice it to say, last month we were pretty proud of ourselves when we wrote our own dialect of LISP, wrote a parser for it in Coffeescript, and then got the whole thing running inside our Postgres database.  Yeah, we were using Lisp on top of Coffeescript to filter SQL records.  That’s how we do things around here!

And all this is before Postgres formally gets JSON as a standard data type.  I can’t wait!

 

 


New site design

I’ve tried to drag this site kicking and screaming into, well, whatever we’re calling this current decade.    The goal is a clean, minimalist interface, easy to find articles, and a good user experience on any device.

I’ve also just learned far more about the innards of WordPress than I ever wanted to.  Apparently, if you create a child theme, then you override existing PHP pages by creating new pages with the same name in your child theme folder.  Unless, of course, the PHP page is functions.php.  In that case, both copies get included, causing all sorts of fun and conflicts.

In the end I gave up and edited the parent theme as well, so if I ever update the parent theme I’ll have to make a couple of changes.

Thank goodness for the Chrome WebInspector.  I can’t imagine how we used to do web development before we had tools like that.


 


Nexus 7

I’m loving my new Nexus 7.  I’m sitting in Casa Cappuccino drinking coffee and updating my blog.  It’s a significant improvement over my old Playbook, for a number of reasons.

Firstly, the on-screen keyboard is better.  A physical keyboard is always going to be more efficient than a touchscreen, but I’m finding that typing seems a lot more fluid on this than on the Playbook.   I’m making fewer errors, the autocorrect is smarter, and editing existing text is easier. I’m not convinced by gesture typing yet, but with a bit of practice it could be quite handy.

Second, being part of the Android ecosystem means there are way more apps available than on the Playbook.  The Nexus 7 is a very nice portable movie viewer, for example.

Third, it’s very accessible as a development platform.  It didn’t take long for me to go through the Android tutorials and write, build and deploy my first application to the device.

On the downside, I miss the smooth integration with iTunes that I’m used to on iOS devices, and I’m frustrated that Greek fonts don’t render correctly.  I love the YouVersion Bible application, and I wish I could switch easily between the English text and the Greek.

That said, this is definitely the nicest mobile device I’ve used so far, and is significantly cheaper than the iPad mini.  Strongly recommended.