public internet

The National Security Agency and its UK counterpart have made repeated and determined attempts to identify people using the Tor anonymity service, but its fundamental security remains intact, according to top-secret documents published on Friday.

The classified memos and training manuals, which were leaked by former NSA contractor Edward Snowden and reported by The Guardian, show that the NSA and the UK-based Government Communications Headquarters (GCHQ) are able to bypass Tor protections, but only against select targets and often with considerable effort. Indeed, one presentation slide grudgingly hailed Tor as "the king of high-secure, low-latency Internet anonymity." Another, titled "Tor Stinks," lamented: "We will never be able to de-anonymize all Tor users all the time."

An article published separately by The Washington Post, also based on documents provided by Snowden, concurred.

"There is no evidence that the NSA is capable of unmasking Tor traffic routinely on a global scale," the report said. "But for almost seven years, it has been trying."


In one of the more audacious and ethically questionable research projects in recent memory, an anonymous hacker built a botnet of more than 420,000 Internet-connected devices and used it to perform one of the most comprehensive surveys ever to measure the insecurity of the global network.

In all, the nine-month scanning project found 420 million IPv4 addresses that responded to probes and 36 million more addresses that had one or more ports open. A large percentage of the unsecured devices bore the hallmarks of broadband modems, network routers, and other devices with embedded operating systems that typically aren't intended to be exposed to the outside world. The researcher found a total of 1.3 billion addresses in use, including 141 million that were behind a firewall and 729 million that returned reverse domain name system records. There were no signs of life from the remaining 2.3 billion IPv4 addresses.
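As a quick sanity check on those figures, the categories can simply be added up. This is a minimal sketch that assumes the four categories are disjoint, which the wording above suggests but does not state outright; the variable names are illustrative only.

```python
# Figures quoted above, in millions of IPv4 addresses.
# Assumption: the four categories do not overlap.
responded_to_probes = 420
open_ports_only = 36
behind_firewall = 141
reverse_dns_only = 729

in_use = responded_to_probes + open_ports_only + behind_firewall + reverse_dns_only
silent = 2300  # "the remaining 2.3 billion", expressed in millions

print(f"in use:  {in_use} million")           # roughly 1.3 billion
print(f"scanned: {in_use + silent} million")  # roughly 3.6 billion
```

Under that assumption the categories sum to about 1.3 billion addresses in use, and adding the silent addresses gives roughly 3.6 billion in total.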

Continually scanning almost 4 billion addresses for nine months is a big job. In true guerrilla research fashion, the unknown hacker developed a small scanning program that scoured the Internet for devices that could be logged into using no account credentials at all or the usernames and passwords of either "root" or "admin." When the program encountered unsecured devices, it installed itself on them and used them to conduct additional scans. The viral growth of the botnet allowed it to infect about 100,000 devices within a day of the program's release. The critical mass allowed the hacker to scan the Internet quickly and cheaply. With about 4,000 clients, it could scan one port on all 3.6 billion addresses in a single day. Because the project ran 1,000 unique probes on 742 separate ports, and possibly because the binary was uninstalled each time an infected device was restarted, the hacker commandeered a total of 420,000 devices to perform the survey.
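To get a feel for why 4,000 clients are enough for a one-port sweep in a day, the per-client probe rate can be estimated. This is a back-of-the-envelope sketch that assumes the work is split evenly and that every address is probed exactly once with no retries; neither assumption comes from the article.

```python
ADDRESSES = 3_600_000_000        # addresses scanned, per the article
CLIENTS = 4_000                  # infected devices doing the scanning
SECONDS_PER_DAY = 24 * 60 * 60

def probes_per_second_per_client(addresses=ADDRESSES,
                                 clients=CLIENTS,
                                 seconds=SECONDS_PER_DAY):
    """Average probe rate each client must sustain to finish in `seconds`."""
    return addresses / (clients * seconds)

rate = probes_per_second_per_client()
print(f"{rate:.1f} probes per second per client")
```

That works out to roughly ten probes per second per device, a load that even a modest embedded device could plausibly sustain.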



In the 1990s, client-server was king. The processing power of PCs and the increasing speed of networks led to more and more desktop applications, often plugging into backend middleware and corporate data sources. But those applications, and the PCs they ran on, were vulnerable to viruses and other attacks. When applications were poorly designed, they could leave sensitive data exposed.

Today, the mobile app is king. The processing power of smartphones and other mobile devices based on Android, iOS, and other mobile operating systems, combined with the speed of broadband cellular networks, has led to more mobile applications with an old-school plan: plug into backend middleware and corporate data sources.

But these apps and the devices they run on are vulnerable… well, you get the picture. It's déjà vu with one major difference: while most client-server applications ran within the confines of a LAN or corporate WAN, mobile apps are running outside of the confines of corporate networks and are accessing services across the public Internet. That makes mobile applications potentially huge security vulnerabilities—especially if they aren't architected properly and configured with proper security and access controls.


It's been six years since I wrote Discussions: Flat or Threaded? and, despite a bunch of evolution on the web since then, my opinion on this has not fundamentally changed.

If anything, my opinion has strengthened based on the observed data: precious few threaded discussion models survive on the web. Putting aside Usenet as a relic and artifact of the past, it is rare to find threaded discussions of any kind on the web today; for web discussion communities that are more than ten years old, the vast majority are flat as a pancake.

I'm game for trying anything new, I mean, I even tried Google Wave. But the more I've used threaded discussions of any variety, the less I like them. I find precious few redeeming qualities, while threading tends to break crucial parts of discussion like reading and replying in deep, fundamental, unfixable ways. I have yet to discover a threaded discussion design that doesn't eventually make me hate it, and myself.

A part of me says this is software Darwinism in action: threaded discussion is ultimately too complex to survive on the public Internet.


Before threaded discussion fans bring out their pitchforks and torches, I fully acknowledge that aspects of threading can be useful in certain specific situations. I will get to that. I know I'm probably wasting my time even attempting to say this, but please: keep reading before commenting. Ideally, read the whole article before commenting. Like Parappa, I gotta believe!

Before I defend threaded discussion, let's enumerate the many problems it brings to the table:

  1. It's a tree.

    Poems about trees are indeed lovely, as Joyce Kilmer promised us, but data of any kind represented as a tree … isn't. Rigid hierarchy is generally not how the human mind works, and the strict parent-child relationship it enforces is particularly terrible for fluid human group discussion. Browsing a tree is complicated, because you have to constantly think about what level you're at, what's expanded, what's collapsed … there's always this looming existential crisis of where the heck am I? Discussion trees force me to spend more time mentally managing that two-dimensional tree than following the underlying discussion.

  2. Where did that reply go?

    In a threaded discussion, replies can arrive any place in the tree at any time. How do you know if there are new replies? Where do you find them? Only if you happen to be browsing the tree at the right place at the right time. It's annoying to follow discussions over time when new posts keep popping up anywhere in the middle of the big reply tree. And God help you if you accidentally reply at the wrong level of the tree; then you're suddenly talking to the wrong person, or maybe nobody at all. For that matter, it absolutely kills me that there might be amazing, insightful responses buried somewhere in the middle of a reply chain that I will never be able to find. Most of all, it just makes me want to leave and never come back.

  3. It pushes discussion off your screen.

    So the first reply is indented under the post. Fair enough; how else would you know that one post is a reply to another post? But this indentation game doesn't ever end. Reply long and hard enough and you've either made the content column impossibly narrow, or you've pushed the content to exit, stage right. That's how endless pedantic responses-to-responses ruin the discussion for everyone. I find that in the "indent everything to the right" game, there are no winners, only losers. It is natural to scroll down on the web, but it is utterly unnatural to scroll right. Indentation takes the discussion in the wrong direction.

  4. You're talking to everyone.

    You think because you clicked "reply" and your post is indented under the person you're replying to, that your post is talking only to that person? That's so romantic. Maybe the two of you should get a room. A special, private room at the far, far, far, far, far right of that threaded discussion. This illusion that you are talking to one other person ends up harming the discussion for everyone by polluting the tree with these massive narrow branches that are constantly in the way.

    At an absolute minimum you're addressing everyone else in that discussion, but in reality, you're talking to anyone who will listen, for all time. Composing your reply as if it is a reply to just one person is a quaint artifact of a world that doesn't exist any more. Every public post you make on the Internet, reply or not, is actually talking to everyone who will ever read it. It'd be helpful if the systems we used for discussion made that clear, rather than maintaining this harmful pretense of private conversations in a public space.

  5. I just want to scroll down.

    Reddit (and to a lesser extent, Hacker News) are probably the best known examples of threaded comments applied to a large audience. While I find Reddit so much more tolerable than the bad old days of Digg, I can still barely force myself to wade through the discussions there, because it's so much darn work. As a lazy reader, I feel I've already done my part by deciding to enter the thread; after that all I should need to do is scroll or swipe down.

    Take what's on the top of reddit right now. It's a cool picture; who wouldn't want to meet Steve Martin and Morgan Freeman? But what's the context? Who is this kid? How did he get so lucky? To find out, I need to collapse and suppress dozens of random meaningless tangents, and the replies-to-tangents, by clicking the little minus symbol next to each one. So that's what I'm doing: reading a little, deciding that tangent is not useful or interesting, and clicking it to get rid of it. Then I arrive at the end and find out that information wasn't even in the topic, or at least I couldn't find it. I'm OK with scrolling down to find information and/or entertainment, to a point. What I object to is the menial labor of collapsing and expanding threaded portions of the topic as I read. Despite what the people posting them might think, those tangents aren't so terribly important that they're worth making me, and every other reader, act on them.

Full bore, no-holds-barred threading is an unmitigated usability disaster for discussion, everywhere I've encountered it. But what if we didn't commit to this idea of threaded discussion quite so wholeheartedly?

The most important guidance for non-destructive use of threading is to put a hard cap on the level of replies that you allow. Although Stack Exchange is not a discussion system – it's actually the opposite of a discussion system, which we have to explain to people all the time – we did allow, in essence, one level of threading. There are questions and answers, yes, but underneath each of those, in smaller type, are the comments.


Now there's a bunch of hard-core discussion sociology here that I don't want to get into, like different rules for comments, special limitations for comments, only showing the top n of comments by default, and so forth. What matters is that we allow one level of replies and that's it. Want to reply to a comment? You can, but it'll be at the same level. You can go no deeper. This is by design, but remember: Stack Exchange is not a discussion system. It's a question and answer system. If you build your Q&A system like a discussion system, it will devolve into Yahoo Answers, or even worse, Quora. Just kidding Quora. You're great.

Would Hacker News be a better place for discussion if they capped reply level? Would Reddit? From my perspective as a poor, harried reader and very occasional participant, absolutely. There are many chronic problems with threaded discussion, but capping reply depth is the easiest way to take a giant step in the right direction.
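The "reply to a comment lands at the same level" rule can be sketched in a few lines. This is a hypothetical illustration, not how Stack Exchange, Hacker News, or Reddit store posts; the `Post` and `Discussion` names and the `max_depth` parameter are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Post:
    id: int
    text: str
    depth: int = 0                      # 0 = top-level post
    parent_id: Optional[int] = None
    replies: List["Post"] = field(default_factory=list)

class Discussion:
    """Stores posts and refuses to nest replies deeper than max_depth."""

    def __init__(self, max_depth: int = 1):
        if max_depth < 1:
            raise ValueError("max_depth must allow at least one reply level")
        self.max_depth = max_depth
        self.posts: Dict[int, Post] = {}
        self._next_id = 1

    def add_post(self, text: str, reply_to: Optional[int] = None) -> Post:
        post = Post(id=self._next_id, text=text)
        self._next_id += 1
        if reply_to is not None:
            target = self.posts[reply_to]
            # Climb toward the root until the target may accept a child,
            # so a reply to a max-depth post becomes its sibling instead.
            while target.depth >= self.max_depth:
                target = self.posts[target.parent_id]
            post.depth = target.depth + 1
            post.parent_id = target.id
            target.replies.append(post)
        self.posts[post.id] = post
        return post

d = Discussion(max_depth=1)
question = d.add_post("How do I cap reply depth?")
comment = d.add_post("Good question.", reply_to=question.id)
reply = d.add_post("Replying to the comment.", reply_to=comment.id)
print(reply.depth, reply.parent_id == question.id)
```

The reply to the comment ends up at depth 1 under the question, next to the comment it answers, so indentation never grows past one level.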

Another idea is to let posts bring their context with them. This is one of the things that Twitter, the company that always does everything wrong and succeeds anyway, gets … shockingly right out of the gate. When I view one of my tweets, it can stand alone, as it should. But it can also bring some context along with it on demand:


Here you can see how my tweet can be expanded with a direct link or click to show the necessary context for the conversation. But it'll only show three levels: the post, my reply to the post, and replies to my post. This idea that tweets – and thus, conversations – should be mostly standalone is not well understood, but it illustrates how Twitter got the original concept so fundamentally right. I guess that's why they can get away with the terrible execution.
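The "bring a limited amount of context along" idea can be sketched as a lookup that returns only the parent, the post itself, and its direct replies. This is an assumed data model for illustration, not Twitter's actual API or storage format; the `POSTS` table and `context_for` function are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: post id -> (text, id of the post it replies to, or None).
POSTS: Dict[int, Tuple[str, Optional[int]]] = {
    1: ("Original post", None),
    2: ("My reply", 1),
    3: ("A reply to my reply", 2),
    4: ("Another reply to my reply", 2),
    5: ("A deeper tangent", 3),
}

def context_for(post_id: int,
                posts: Dict[int, Tuple[str, Optional[int]]] = POSTS
                ) -> Tuple[Optional[str], str, List[str]]:
    """Return (parent text, post text, direct reply texts) and nothing deeper."""
    text, parent_id = posts[post_id]
    parent = posts[parent_id][0] if parent_id is not None else None
    replies = [t for _, (t, p) in sorted(posts.items()) if p == post_id]
    return parent, text, replies

print(context_for(2))
```

Post 5, the deeper tangent, never appears in the context for post 2; a reader would have to open post 3 to see it.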

I believe selective and judicious use of threading is the only way it can work for discussion. You should be wary of threading as a general purpose solution for human discussions. Always favor simple, flat discussions instead.

Venture investors still have a healthy appetite for early-stage consumer Internet companies, but those startups are having a harder time raising follow-on financing.

Overall the amount invested in consumer information services was off 42% in the first nine months as the difficulties of newly public Internet companies such as Facebook and Zynga cast doubt on the business models and valuations of social media companies.



In 2011, the US government rolled out its "International Strategy for Cyberspace," which reminded us that "interconnected networks link nations more closely, so an attack on one nation’s networks may have impact far beyond its borders." An in-depth report today from the New York Times confirms the truth of that statement as it finally lays bare the history and development of the Stuxnet virus—and how it accidentally escaped from the Iranian nuclear facility that was its target.

The article is adapted from journalist David Sanger's forthcoming book, Confront and Conceal: Obama’s Secret Wars and Surprising Use of American Power, and it confirms that both the US and Israeli governments developed and deployed Stuxnet. The goal of the worm was to break Iranian nuclear centrifuge equipment by issuing specific commands to the industrial control hardware responsible for their spin rate. By doing so, both governments hoped to set back the Iranian research program—and the US hoped to keep Israel from launching a pre-emptive military attack.

The code was only supposed to work within Iran's Natanz refining facility, which was air-gapped from outside networks and thus difficult to penetrate. But computers and memory cards could be carried between the public Internet and the private Natanz network, and a preliminary bit of "beacon" code was used to map out all the network connections within the plant and report them back to the NSA.


J. Alex Halderman and Nadia Heninger write in with an update to yesterday's story on RSA key security: "Yesterday Slashdot posted that RSA keys are 99.8% secure in the real world. We've been working on this concurrently, and as it turns out, the story is a bit more complicated. Those factorable keys are generated by your router and VPN, not [...]. The geeky details are pretty nifty: we downloaded every SSL and SSH key on the internet in a few days, did some math on 100 million digit numbers, and ended up with 27,000 private keys. (That's 0.4% of SSL keys in current use.) We posted a blog post summarizing our findings over at Freedom to Tinker."
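The core weakness is easy to demonstrate: if two RSA moduli happen to share a prime factor, a single greatest-common-divisor computation reveals it and both keys fall. The sketch below uses tiny toy primes and a naive pairwise comparison purely for illustration; it is not the researchers' code, and comparing every pair would be far too slow at the scale described above.

```python
from itertools import combinations
from math import gcd
from typing import Dict, Iterable, Tuple

def find_shared_factors(moduli: Iterable[int]) -> Dict[int, Tuple[int, int]]:
    """Map each modulus that shares a prime with another to its two factors."""
    broken: Dict[int, Tuple[int, int]] = {}
    for a, b in combinations(list(moduli), 2):
        p = gcd(a, b)
        # p == 1 means nothing is shared; p == a or p == b means duplicate
        # (or degenerate) moduli, which this sketch simply ignores.
        if p not in (1, a, b):
            broken[a] = (p, a // p)
            broken[b] = (p, b // p)
    return broken

# Toy moduli: the first two reuse the prime 101; the third is independent.
n1 = 101 * 103
n2 = 101 * 107
n3 = 109 * 113

print(find_shared_factors([n1, n2, n3]))
```

Only the two moduli that reuse a prime are factored; the third, built from independent primes, is untouched. This is why poor randomness on embedded devices matters more here than the strength of RSA itself.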

rhartness writes "I am a long-time software engineer; however, almost all of my work has been developing server-side intranet applications or applications for the Windows desktop environment. With that said, I have recently come up with an idea for a new website which would require extremely high levels of security (i.e. I need to be sure that my servers are as 100% rock-solid, unhackable as possible.) I am an experienced developer, and I have a general understanding of web security; however, I am clueless about what is required to create a web server that is as secure as, say, a banking account management system. Can the Slashdot community recommend good websites, books, or any other resources that thoroughly discuss the topic of setting up a small web server or network for hosting a site that is as absolutely secure as possible?"


We've always put a heavy emphasis on performance at Stack Overflow and Stack Exchange. Not just because we're performance wonks (guilty!), but because we think speed is a competitive advantage. There's plenty of experimental data proving that the slower your website loads and displays, the less people will use it.

[Google found that] the page with 10 results took 0.4 seconds to generate. The page with 30 results took 0.9 seconds. Half a second delay caused a 20% drop in traffic. Half a second delay killed user satisfaction.

In A/B tests, [Amazon] tried delaying the page in increments of 100 milliseconds and found that even very small delays would result in substantial and costly drops in revenue.

I believe the converse of this is also true. That is, the faster your website is, the more people will use it. This follows logically if you think like an information omnivore: the faster you can load the page, the faster you can tell whether that page contains what you want. Therefore, you should always favor fast websites. The opportunity cost for switching on the public internet is effectively nil, and whatever it is that you're looking for, there are multiple websites that offer a similar experience. So how do you distinguish yourself? You start by being, above all else, fast.

Do you, too, feel the need – the need for speed? If so, I have three pieces of advice that I'd like to share with you.

1. Follow the Yahoo Guidelines. Religiously.

The golden reference standard for building a fast website remains Yahoo's 13 Simple Rules for Speeding Up Your Web Site from 2007. There is one caveat, however:

There's some good advice here, but there's also a lot of advice that only makes sense if you run a website that gets millions of unique users per day. Do you run a website like that? If so, what are you doing reading this instead of flying your private jet to a Bermuda vacation with your trophy wife?

So … a funny thing happened to me since I wrote that four years ago. I now run a network of public, community driven Q&A web sites that do get millions of daily unique users. (I'm still waiting on the jet and trophy wife.) It does depend a little on the size of your site, but if you run a public website, you really should pore over Yahoo's checklist and take every line of it to heart. Or use the tools that do this for you:

We've long since implemented most of the 13 items on Yahoo's list, except for one. But it's a big one: Using a Content Delivery Network.

The user's proximity to your web server has an impact on response times. Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective. But where should you start?

As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, changing the architecture could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this application architecture step.

Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule. Rather than starting with the difficult task of redesigning your application architecture, it's better to first disperse your static content. This not only achieves a bigger reduction in response times, but it's easier thanks to content delivery networks.

As a final optimization step, we just rolled out a CDN for all our static content. The results are promising; the baseline here is our datacenter in NYC, so the below should be read as "how much faster did our website get for users in this area of the world?"


In the interests of technical accuracy, static content isn't the complete performance picture; you still have to talk to our servers in NYC to get the dynamic content which is the meat of the page. But 90% of our visitors are anonymous, only 36% of our traffic is from the USA, and Yahoo's research shows that 40 to 60 percent of daily visitors come in with an empty browser cache. Optimizing this cold cache performance worldwide is a huge win.

Now, I would not recommend going directly for a CDN. I'd leave that until later, as there are a bunch of performance tweaks on Yahoo's list which are free and trivial to implement. But using a CDN has gotten a heck of a lot less expensive and much simpler since 2007, with lots more competition in the space from companies like Amazon, NetDNA, and CacheFly. So when the time comes, and you've worked through the Yahoo list as religiously as I recommend, you'll be ready.

2. Love (and Optimize for) Your Anonymous and Registered Users

Our Q&A sites are all about making the internet better. That's why all the contributed content is licensed back to the community under Creative Commons and always visible regardless of whether you are logged in or not. I despise walled gardens. In fact, you don't actually have to log in at all to participate in Q&A with us. Not even a little!

The primary source of our traffic is anonymous users arriving from search engines and elsewhere. It's classic "write once, read – and hopefully edit – millions of times." But we are also making the site richer and more dynamic for our avid community members, who definitely are logged in. We add features all the time, which means we're serving up more JavaScript and HTML. There's an unavoidable tension here between the download footprint for users who are on the site every day, and users who may visit once a month or once a year.

Both classes are important, but have fundamentally different needs. Anonymous users are voracious consumers optimizing for rapid browsing, while our avid community members are the source of all the great content that drives the network. These guys (and gals) need each other, and they both deserve special treatment. We design and optimize for two classes of users: anonymous, and logged in. Consider the following Google Chrome network panel trace on a random Super User question I picked:

User class | Data transferred | Timings shown in the trace
Logged in (as me) | 233.31 KB | 1.17 s, 1.31 s
Anonymous | 111.40 KB | 768 ms, 1.28 s

We minimize the footprint of HTML, CSS and JavaScript for anonymous users so they get their pages even faster. We load a stub of very basic functionality and dynamically "rez in" things like editing when the user focuses the answer input area. For logged in users, the footprint is necessarily larger, but we can also add features for our most avid community members at will without fear of harming the experience of the vast, silent majority of anonymous users.

3. Make Performance a Point of (Public) Pride

Now that we've exhausted the Yahoo performance guidance, and made sure we're serving the absolute minimum necessary to our anonymous users – where else can we go for performance? Back to our code, of course.

When it comes to website performance, there is no getting around one fundamental law of the universe: you can never serve a webpage faster than you can render it on the server. I know, duh. But I'm telling you, it's very easy to fall into the trap of not noticing a few hundred milliseconds here and there over the course of a year or so of development, and then one day you turn around and your pages are taking almost a full freaking second to render on the server. It's a heck of a liability to start 1 full second in the hole before you've even transmitted your first byte over the wire!

That's why, as a developer, you need to put performance right in front of your face on every single page, all the time. That's exactly what we did with our MVC Mini Profiler, which we are contributing back to the world as open source. The simple act of putting a render time in the upper right hand corner of every page we serve forced us to fix all our performance regressions and omissions.


(Note that you can click on the SQL linked above to see what's actually being run and how long it took in each step. And you can use the share link to share the profiler data for this run with your fellow developers to shame them, er, diagnose a particular problem. And it works for multiple AJAX requests. Have I mentioned that our open source MVC Mini Profiler is totally freaking awesome? If you're on a .NET stack, you should really check it out.)

In fact, with the render time appearing on every page for everyone on the dev team, performance became a point of pride. We had so many places where we had just gotten a little sloppy or missed some tiny thing that slowed a page down inordinately. Most of the performance fixes were trivial, and even the ones that were not turned into fantastic opportunities to rearchitect and make things simpler and faster for all of our users.
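The underlying idea, stamping every page with how long it took to render, is simple enough to sketch in any stack. The example below is a hypothetical Python decorator, not the MVC Mini Profiler and not its API; the `show_render_time` and `question_page` names and the `render-time` CSS class are assumptions made for illustration.

```python
import time
from functools import wraps

def show_render_time(render):
    """Wrap a function that returns HTML and append its render time in ms."""
    @wraps(render)
    def timed(*args, **kwargs):
        started = time.perf_counter()
        html = render(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - started) * 1000.0
        badge = f'<div class="render-time">{elapsed_ms:.1f} ms</div>'
        if "</body>" in html:
            # Place the badge just before the closing body tag.
            return html.replace("</body>", badge + "</body>", 1)
        return html + badge
    return timed

@show_render_time
def question_page(title: str) -> str:
    # Stand-in for real rendering work (queries, templating, and so on).
    return f"<html><body><h1>{title}</h1></body></html>"

print(question_page("Why is my page slow?"))
```

A real profiler would also break the total down by step and by SQL query, but even this single number, visible on every page, makes a regression hard to ignore.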

Did it work? You bet your sweet ILAsm it worked:


That's the Google crawler page download time; the experimental Google Site Performance page, which ostensibly reflects complete full-page browser load time, confirms the improvements:


While server page render time is only part of the performance story, it is the baseline from which you start. I cannot emphasize enough how much the simple act of putting the page render time on the page helped us, as a development team, build a dramatically faster site. Our site was always relatively fast, but even for a historically "fast" site like ours, we realized huge gains in performance from this one simple change.

I won't lie to you. Performance isn't easy. It's been a long, hard road getting to where we are now – and we've thrown a lot of unicorn dollars toward really nice hardware to run everything on, though I wouldn't call any of our hardware choices particularly extravagant. And I did follow my own advice, for the record.

I distinctly remember switching from AltaVista to Google back in 2000 in no small part because it was blazing fast. To me, performance is a feature, and I simply like using fast websites more than slow websites, so naturally I'm going to build a site that I would want to use. But I think there's also a lesson to be learned here about the competitive landscape of the public internet, where there are two kinds of websites: the quick and the dead.

Which one will you be?

