13 episodes

Zero Trust Network Architecture

Agilicus. Learn. Do. Teach. Don Bowman

    • Business

Zero Trust Network Architecture

    The strong password, the breach, and the multi-factor save


    I performed my periodic audit of my accounts. And, to my surprise, I found that the password for my rubygems account was in the breach corpus. The 2nd factor made the save, but… the password was generated via pwgen 12 (so it looked like aibeaNongoo0). I think you will agree it was not guessed. So, on this topic, when was the last time you opened chrome://settings/passwords/check?start=true and checked your accounts for safety? Well, read the next couple of paragraphs and then go to it.







    There’s a spectrum of password strength. On the one end, some people use something guessable (a pet’s name, a birthday), and they re-use that password on many (all?) sites.







    Next we have those who have a strong(ish) password, but reuse it across multiple sites.







    And then we have the strongest password approach: every site gets its own password, and each is strongly generated. This necessitates a password manager (I use KDE Wallet, which stores its data in my GPG keyring).
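
    As a minimal sketch of what “strongly generated” means (this is illustrative Python, not the episode’s tooling; pwgen itself is a separate command-line utility), a per-site password can be drawn from a cryptographically secure random source:

    import secrets
    import string

    def generate_password(length: int = 12) -> str:
        # Use a CSPRNG (never random.random()) for password material.
        alphabet = string.ascii_letters + string.digits
        return "".join(secrets.choice(alphabet) for _ in range(length))

    print(generate_password())  # something along the lines of 'aibeaNongoo0'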







    Layered on top of this is a multi-factor strategy. Interestingly, it dramatically improves all three approaches. A breach of the password does not include your 2nd-factor code generator. These end up being uncorrelated risks, and the combination is very strong.







    However, this is becoming very tedious. Over the years one accumulates hundreds of online accounts. Some merchants force you to create an account just to buy a single product. Add a few forums here and there, and suddenly you have two or three hundred accounts to audit. Changing the passwords across them is no mean feat.







    So what is the solution? My very strong password was breached on a single site, saved only by the 2nd factor. Well, in my opinion, the answer is to remove the password. That’s right. Rather than make it 16 characters long, I want to go to 0. Use a single common identity provider, via OpenID Connect. Secure that appropriately (strong password + 2nd-factor). And force each and every web site I use to accept it for authentication (without sharing the password).







    OK, gentle reader, now your homework. Open chrome://settings/passwords/check?start=true. Check yourself out on https://haveibeenpwned.com/. If your browser’s saved-password check flags any accounts, fix them: go to the web site in question, change the password to a new single-use one, and enable multi-factor authentication if available.
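
    If you prefer to script the check, here is a minimal sketch using the public Pwned Passwords range API behind haveibeenpwned.com (the k-anonymity scheme means only the first 5 characters of the SHA-1 hash ever leave your machine); treat it as an illustration rather than a polished tool:

    import hashlib
    import urllib.request

    def pwned_count(password: str) -> int:
        # Query by SHA-1 prefix; the response lists matching suffixes and breach counts.
        digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
        prefix, suffix = digest[:5], digest[5:]
        url = f"https://api.pwnedpasswords.com/range/{prefix}"
        with urllib.request.urlopen(url) as resp:
            for line in resp.read().decode().splitlines():
                candidate, _, count = line.partition(":")
                if candidate == suffix:
                    return int(count)
        return 0

    print(pwned_count("password123"))  # a large number; your generated passwords should print 0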

    • 2 min
    Take time to stop and sniff the mime type


    My first involvement with HTTP and the web came in 1992. Challenged to create a MUSH as a means of delivering online education, the zeitgeist of the time around information and the Internet came through, and I built a browser and web server. I had never seen or heard of the web before; the closest I had seen was Veronica and Gopher, and of course Archie. Archie was accessed via telnet, and was kind of far from graphical.







    The HTTP 0.9 protocol was not yet known as that, and was exceptionally simple. You would telnet to port 80 on some host, type ‘GET /path’, and the file would come back as-is. If you knew what to do with the result, you were good. Initially it was thought that only text would be used (no fonts, no CSS, no images), so this was fine.
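
    For flavour, here is roughly the same exchange from Python instead of telnet (example.com is just a placeholder, and most modern servers no longer honour bare HTTP/0.9 requests, so treat this purely as a sketch):

    import socket

    # HTTP/0.9: one request line, no headers, no status line; whatever lives at
    # that path comes back as-is and the connection closes.
    with socket.create_connection(("example.com", 80)) as s:
        s.sendall(b"GET /\r\n")
        print(s.recv(65536).decode(errors="replace"))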







    In the system I built (CALVIN, Computer-Aided Learning Vision Information Network), a C++-based fork+serve web server managed the file serving. All files were treated equally: the path you gave was the path it served. An X-Windows + Motif-based client with a simple HTML widget was the other end of this, running on a DECstation 3100. While implementing this I had an idea: why not guess the type based on the file extension? This way I could handle an image and invent some sort of image tag for HTML. The img tag had not yet been invented (and the specs, such as they were, were nowhere easy to find); I think I chose a different tag name from the img that was later standardised.

    So I forged ahead. I did some sort of strtok() on the file name, looked at the string after the dot, and if it was jpg or gif or pnm, rendered appropriately. Life was simple then. Got the project done, did the presentation, got the grade, got out. The X interface leaked memory like a sieve, so the demo was short 🙂
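
    The modern Python equivalent of that extension-guessing hack fits in a few lines (the file names here are invented for illustration):

    import mimetypes

    # Guess the type from the extension alone, the same trick the 1992 client used.
    for name in ("diagram.gif", "photo.jpg", "scan.pnm", "notes.txt"):
        mime, _encoding = mimetypes.guess_type(name)
        print(name, "->", mime or "unknown")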







    Fast forward to 2020. The standards have evolved somewhat, and a header called Content-Type now exists for this purpose. The server is responsible for telling the client how to interpret content, and a well-behaved client should never guess what to do based on the extension (sorry, 1992 me). You see, since 1992, the web has become a less simple, less safe space. Malicious actors discovered they could send active content to be evaluated by Internet Explorer’s aggressive mime-type-guessing algorithm, and thus gain control of the desktop.







    History suggests that, for each new security hole in HTTP, a new header is created. This flaw was no exception. Enter the X-Content-Type-Options header. In proper use, one adds:







    X-Content-Type-Options: nosniff







    to the HTTP response. The browser, on receipt, listens solely to the server’s declared Content-Type, and not to its own guessing algorithm. Security achieved!
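
    As a minimal sketch (the server framework and content are illustrative, not from the episode), a well-behaved server sets both the explicit type and the nosniff directive:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class NoSniffHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"<p>hello</p>"
            self.send_response(200)
            # Declare the type explicitly, and tell the browser not to second-guess it.
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("X-Content-Type-Options", "nosniff")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8080), NoSniffHandler).serve_forever()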







    Fast forward to today. As an experiment in magic proxy forwarding zero-trust mumbo jumbo logic, I exposed my printer to the Internet (only for authenticated users with valid roles,

    • 3 min
    Delicious Dogfood: Cloud Native WordPress


    Agilicus upgraded our web site infrastructure, and there was only one way to go: full Cloud Native. Cloud Native means many small components, stateless, scaling in and out, resilient as a whole rather than individually. As a consequence, we made design decisions for database and storage. Let’s talk about that!







    First, WordPress. It has been around for a long time. WordPress architecture dates to an era when ‘Cloud’ meant a virtual machine at best. You ran a single WordPress instance, a single database, and the storage was tightly coupled to the WordPress instance. Fancy folk used a fileserver for the storage, and 2 WordPress instances sharing it. But you always had a few components you had to assume were infinitely reliable and scalable (storage, database). Very few native WordPress installs run at high scale; most instead use either a headless CMS outputting static files, or are front-ended by CDNs and a double dash of hope.







    When I first started Agilicus I installed WordPress under Kubernetes (GKE), but only just. I had a cluster with non-preemptible nodes. I ran a mysql pod with a single PVC and a WordPress pod with a single PVC. Scale? Forgetaboutit. Resilience? Well, when the node went down, the web site went down.











    Clean sheet. We counsel our customers to run on our platform, so why not re-imagine our web site, our front door, our public face, on the same platform? Eat the delicious dogfood!







    So, what does Dogfooding mean? It means no cheating. Remove all limitations. Make all the state scalable, cloud native. Simple single sign-on.

    The architecture of WordPress has a few complexities for a Cloud Native world:









    * Plugins have unlimited, direct access to the database and filesystem
    * Users can install plugins from the web front-end
    * Content is a hybrid of local files and database rows
    * Plugins modify content (e.g. scale images, compress CSS)
    * The database must be MySQL; it’s hard-coded everywhere
    * Login infrastructure is designed for local storage in the database









    OK, I got this. Let’s bring in a few tools to help. First, the database. For this we will use TiDB. It presents a MySQL facade and is built on TiKV, which in turn is based on the Raft consensus algorithm. Raft is quite magic, and powers many Cloud Native components (etcd, CockroachDB, …). The Raft algorithm allows individual members to be unreliable, to come and go, while the overall set stays consistent and reliable. It’s bulletproof.
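
    What the facade means in practice: anything that speaks the MySQL wire protocol, WordPress included, can talk to TiDB unchanged. A rough sketch (the host, credentials, and the pymysql client are illustrative assumptions; 4000 is TiDB’s conventional default SQL port):

    import pymysql  # any MySQL-protocol client will do; pymysql is just an example

    conn = pymysql.connect(host="tidb.example.internal", port=4000,
                           user="wordpress", password="***", database="wordpress")
    with conn.cursor() as cur:
        # A standard WordPress table, served by a Raft-backed distributed store.
        cur.execute("SELECT option_value FROM wp_options WHERE option_name = %s",
                    ("siteurl",))
        print(cur.fetchone())
    conn.close()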







    To deploy TiDB we will use an Operator, allowing us to scale the database up and down and to upgrade it. Now we can upgrade the database or add capacity without any impact on the running site. Brilliant!







    Now that the database is solved in a Kubernetes, Cloud-Native way, on to the storage. This is considerably tougher: there is no Read-Write-Many storage in Google GKE. So, what can we do? I considered using GlusterFS. I’ve previously tried NFS. Terrible. It turns out there is a plugin for WordPress called wp-stateless. It causes all the images etc. to be stored in Google Cloud Storage (GCS), and thus be accessible to all the other Pods (and to the end user). Solved!







    Moving on,

    • 6 min
    Web Application Security 101: Get the basics right


    Web Application Security 101: The Basics





    CSRF? CSP? CORS?

    Web Application Security is complex to get perfect, but easy to get better than average. I have a thesis: if you have not secured the things in the easy category, the security culture of your organisation suggests the more complex things won’t be done well either. One of the tools I use to assess this security 101 is the Mozilla Observatory. Sure, it doesn’t check everything, but if you score a 0 there, you are likely not putting in the effort anywhere.







    In this presentation (and video below) I talk a little bit about the “do what I say” security concept for a web site owner. The ‘what I say’ is encoded as a set of headers (Content-Security-Policy, the XSS-* family, cross-origin request headers) plus the CAA DNS record. I show you how to go from bad to good with a small amount of effort.
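
    As a rough sketch of what “do what I say” looks like in practice (the framework and the specific policy values are illustrative starting points, not the ones used on the Agilicus site), the headers are simply attached to every response:

    from flask import Flask

    app = Flask(__name__)

    @app.after_request
    def add_security_headers(resp):
        # Conservative defaults; a real Content-Security-Policy needs per-site tuning.
        resp.headers["Content-Security-Policy"] = "default-src 'self'"
        resp.headers["X-Content-Type-Options"] = "nosniff"
        resp.headers["X-Frame-Options"] = "DENY"
        resp.headers["Referrer-Policy"] = "no-referrer"
        resp.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
        return resp

    @app.route("/")
    def index():
        return "<p>hello</p>"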







    My call to action: Learn these Web Application Security 101 techniques. Apply them to a site you own or influence. Teach someone else about them. Let’s pay it forward.

    Video: Web Application Security 101 (https://youtu.be/pKlN2tp4mvs)

    • 20 min
    Zero Trust: Connecting The Digitally Disconnected


    OVERVIEW







    Your organisation has cascading sets of people it interacts with. In the core, there are full-time employees. They have badges, access cards, accounts, organisationally-issued hardware. They use the IT-managed hardware and software to do their jobs, including a VPN to access services remotely. You create IT-managed identities, often in systems like Google G Suite or Microsoft Active Directory.







    The next tranche of team members consists of contractors. Indeed, you might treat most of these users no differently from the full-time staff. But some contractors are in specific job roles which do not require them to have IT-managed hardware or accounts. They may be specialists who work outside the building. These users might have no corporately-managed identity. Examples might include transit drivers or janitorial services.







    After these people we have team members that are even more digitally disconnected. Seasonal temporary workers. Temporary consultants. Workers from affiliated but arm’s-length organisations. In a municipal environment these could include lifeguards for the pool, workers with the library system, or local social services providers.







    Traditionally these other tiers of users were ignored from an IT standpoint. Paystubs were delivered on paper, policies were posted on a bulletin board. Some organisations would use shared accounts on kiosk computers for online learning management systems.







    Covid-19 has accelerated the thinking around these users. How can we furlough people and tell them to “check the Intranet” for details on what has changed and when they can come back to work, if they have no access to the Intranet? How can we ask them to use a mail drop for their pay stubs or timesheets if we are asking them not to come into the building?







    Identity management (Authentication) and role-management (Authorisation) are the two key disciplines we need to improve if we are to solve the issue of connecting the digitally disenfranchised.







    Zero Trust Architecture







    A Zero Trust architecture allows seamless access to any resource, from any device, for any user, from any network. And it does so more securely. Zero Trust splits user identity from user authorisation. It moves from a perimeter-based security practice to fine-grained user and resource control.















    Zero Trust (as defined by NIST SP 800-207) is a term for evolving cybersecurity from static network perimeter-based security (e.g. VPN) to an architecture that focuses on the user (identity) and the resource (authorisation).







    The core requirements:









    * Simple, secure identity. Make it trivial for your users to log in with a single username/password, single sign-on, and multi-factor authentication.







    * Decouple authorisation from Identity and from each Application.









    Once these are achieved you can simply and securely extend access to individual systems to the users who need them. Those digitally disenfranchised users can access that corporate Intranet, even if their employment has been suspended, even if they have no corporate email address, device, or VPN.
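
    A rough sketch of that decoupling in code. The issuer, audience, JWKS path, and the “roles” claim are illustrative assumptions (core OpenID Connect does not define a roles claim, and the role lookup could equally be a separate policy service); PyJWT is used only as an example library:

    import jwt  # PyJWT
    from jwt import PyJWKClient

    ISSUER = "https://auth.example.com"   # illustrative identity provider
    AUDIENCE = "intranet"                 # illustrative application / client id

    def authorise(id_token: str, required_role: str) -> bool:
        # 1. Authentication: verify the OpenID Connect ID token against the
        #    provider's published signing keys.
        jwks = PyJWKClient(f"{ISSUER}/.well-known/jwks.json")
        signing_key = jwks.get_signing_key_from_jwt(id_token).key
        claims = jwt.decode(id_token, signing_key, algorithms=["RS256"],
                            audience=AUDIENCE, issuer=ISSUER)
        # 2. Authorisation: the role check is deliberately separate from identity.
        return required_role in claims.get("roles", [])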







    Evolving beyond the VPN







    For many years the VPN was the gold standard of remote security. You kept your inside network isolated except for a few users with curated software on managed devices.







    The VPN has a large cost: managing the client software, for one. And it is a stateful device; it does not scale well as we add users.

    • 7 min
    The False Choice of Risk Versus Reach


    Scroll to the bottom for the video covering this topic.







    Security features are often disabled because of interactions with older devices and software. There is a relationship between the cost of upgrading those devices and the cost associated with the risk you cannot otherwise mitigate. Many view this risk as linear. If we draw a graph with risk on the left (vertical) axis and number of users along the bottom, we might think the relationship is a straight line: as we increase the users addressed, we are forced to reduce security to accommodate their older devices.







    However, this is a false view. In reality there is an 80:20 rule for most things, and this is no exception. Recognising that 80% of our users will be using Chrome or Firefox, and that most of these will be on the last 1 or 2 versions, we can re-draw our graph. For a constant risk, we can reach the majority of our users. Past that point the risk grows more rapidly than the number of users reached, since we are forced to start disabling features for ever-smaller groups. Worse, those risks affect both groups (the ones with new software and the ones with old).







    This brings up the question: what is really on the table for any change, and can we frame it in strictly economic terms? If we can price the risk, put it in terms of $, it is easier to see that perhaps it is cheaper to upgrade the old devices.







    Consider a hypothetical organisation. It has invested in the past in smart TVs for the meeting rooms. These smart TVs are no longer upgradable and only support TLS 1.0 with RC4. This causes the organisation to leave these older security standards enabled on its corporate services, increasing the risk for all users and all devices. Which would cost more: $1000 per smart TV, or a data breach and the associated reputational damage?
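
    A toy version of that comparison, with every number invented purely for illustration:

    # Annualised cost of leaving TLS 1.0 + RC4 enabled vs. replacing the TVs.
    breach_probability_per_year = 0.02        # illustrative guess
    breach_cost = 2_000_000                   # illustrative: incident + reputation, in $
    annualised_risk = breach_probability_per_year * breach_cost   # $40,000 per year

    smart_tvs = 30                            # illustrative fleet size
    upgrade_cost = smart_tvs * 1_000          # one-time, at $1000 per TV

    print(f"risk: ${annualised_risk:,.0f}/year vs upgrade: ${upgrade_cost:,.0f} once")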







    I would like to challenge another assumption: that this curve is linear to begin with. I suggest it’s more Z-shaped, and that if we could truly assess it, we would design our processes and procedures around the 2nd knee in the curve. Anything past it, those devices are not worth the risk of reaching.







    I gave one example above (TLS version), but there are many such design choices (upgrade of client-software, upgrade of upstream libraries such as jQuery, enabling 2-factor authentication, etc).







    Now, you may think this concept of expressing risk in $, and user reach in $, is abstract. But I assure you it’s real. It makes the two fungible, letting you decide where best to spend to obtain the maximal risk:reward ratio.







    Let’s try an example. Let’s open https://ssllabs.com/ssltest in another tab. Now, on the right-hand side, let’s select one of the “Recent Worst” entries (or a “Recent Best” that is less than an A). Feel free to test your own site of course. If we scroll down, we will see the Handshake Simulation and the various client versions. The one I picked was www.surugabank.co.jp. As you can see, it received an F. Is this because of a desire to support old devices? It’s doubtful; it seems this bank just doesn’t care.















    So let’s select something with a slightly higher score. For this I chose licensemanager.sonicwall.com. Here we can see that older protocols are indeed in use, albeit set up correctly: RC4, weak Diffie-Hellman, TLS 1.0.















    If we scroll down to the Handshake Simulation, we can see the reason. Many old devices are supported, and some force the use of weak parameters.
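
    If you want to reproduce part of that check yourself, here is a minimal sketch that probes which TLS versions a server will still negotiate (the host is a placeholder; note that what your client can offer also depends on the local OpenSSL build, and SSL Labs is far more thorough):

    import socket
    import ssl

    HOST = "example.com"  # substitute the site you want to test

    for version in (ssl.TLSVersion.TLSv1, ssl.TLSVersion.TLSv1_1,
                    ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        # Pin the client context to exactly one protocol version and try a handshake.
        ctx = ssl.create_default_context()
        ctx.minimum_version = version
        ctx.maximum_version = version
        try:
            with socket.create_connection((HOST, 443), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
                    print(version.name, "accepted, cipher:", tls.cipher()[0])
        except (ssl.SSLError, OSError):
            print(version.name, "rejected")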
