
    0 0
  • 01/30/16--14:50: Nick Kew: Stealth Debit
  • Last January I gave my dad a gift subscription to The Economist for his birthday.  He had been a subscriber for many years, but somehow lost it when his life was dominated by an altogether more serious problem.  It’s the ideal birthday present for someone who’s never been easy to buy for: not merely absolutely right for him, but also something that can be repeated each year thereafter.

    A week ago he ‘phoned me, having noticed that the end date of his subscription had moved a year, to January 2017.  Great, that’s exactly as intended, but he wondered if I’d renewed.  In fact I hadn’t: I’d been awaiting contact from The Economist about renewal.  Hmm … if they haven’t asked either of us to pay, who do they suppose is paying?  Or do they have one of those billing departments that gets into a terrible mess?

    Checking my bank accounts, I find I had indeed set up a direct debit, and yesterday it was debited for another year’s subscription.  OK, fine, but isn’t it customary to send at least a courtesy email notifying me ahead of a direct debit?  Not a big issue: I’d intended the payment anyway and had ample funds in the account.  But I’m mildly p***ed off not to have been warned.

    Perhaps they fear losing a subscription?  That would put them in the same game as scammers who seek to sign you up by stealth to something you don’t want.  Not a happy thought.

    0 0

    I've been cautioned and advised by several good friends that I should take a chill pill on commenting about various political things. Some of the topics I've been quite vocal about are high-profile things involving high-powered people, and I might be beginning to get noticed by them, which of course is not a good thing!

    I get frustrated by political actions that I find to be stupid, and I don't hesitate to tell it straight, the way I think about it. Obviously every such statement bothers someone else. It's one thing when it's irrelevant noise, but if it gets noisy then I'm a troublemaker.

    I'm not keen to get to that state.

    It's not because I have anything to hide or protect - not in the least. Further, I'm not scared off by the PM telling private sector people like me to "go home" or "be exposed", but publicly naming private individuals in parliament is rather over the top IMO. The last thing I want is to end up there.

    I have an immediate family and an extended family of 500+ in WSO2 that I'm responsible for. I'm taping up my big mouth for their sake.

    Instead I will try to blog constructively & informatively whenever time permits.

    Similarly I will try to keep my big mouth controlled about US politics too. It's really not my problem to worry about issues there!

    I should really kill off my FB account. However I do enjoy getting info about friends and family life events and FB is great for that. So instead I'll stop following everyone except for close friends and family.

    It's been fun and I like intense intellectual debate. However, maybe another day - just not now.

    (P.S.: No, no one threatened me or forced me to do this. I just don't want to come close to that possibility!)

    0 0

    Last week, I showed how to use the Apache CXF Fediz IdP as an identity broker with a real-world SAML SSO IdP based on the Shibboleth IdP (as opposed to an earlier article which used a mocked SAML SSO IdP). In this post, I will give similar instructions to configure the Fediz IdP to act as an identity broker with Keycloak.

    1) Install and configure Keycloak

    Download and install the latest Keycloak distribution (tested with 1.8.0). Start Keycloak in standalone mode by running the standalone startup script in the 'bin/' directory.

    1.1) Create users in Keycloak

    First we need to create an admin user by navigating to the following URL, and entering a password:

    • http://localhost:8080/auth/
    Click on the "Administration Console" link and log in using the admin user credentials. You will see the configuration details of the "Master" realm. For the purposes of this demo, we will create a new realm. Hover the mouse pointer over "Master" in the top left-hand corner, and click on "Add realm". Create a new realm called "realmb". Now we will create a new user in this realm. Click on "Users" and select "Add User", specifying "alice" as the username. Click "Save", then go to the "Credentials" tab for "alice", specify a password, unselect the "Temporary" checkbox, and click "Reset Password".
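
    If you prefer scripting this setup, the same realm and user can also be created via Keycloak's admin REST API. The following is only a hedged sketch: the endpoint paths assume the default "/auth" context used above, "admin-password" is whatever you entered for the admin user, and jq is used purely for convenience.

    # Obtain an admin access token from the master realm (admin-cli is Keycloak's built-in client)
    TOKEN=$(curl -s -d "client_id=admin-cli" -d "grant_type=password" \
      -d "username=admin" -d "password=admin-password" \
      http://localhost:8080/auth/realms/master/protocol/openid-connect/token | jq -r .access_token)

    # Create the "realmb" realm
    curl -s -X POST http://localhost:8080/auth/admin/realms \
      -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
      -d '{"realm": "realmb", "enabled": true}'

    # Create the user "alice" in "realmb" (set her password afterwards as described above)
    curl -s -X POST http://localhost:8080/auth/admin/realms/realmb/users \
      -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
      -d '{"username": "alice", "enabled": true}'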

    1.2) Create a new client application in Keycloak

    Now we will create a new client application for the Fediz IdP in Keycloak. Select "Clients" in the left-hand menu, and click on "Create". Specify the following values:
    • Client ID: urn:org:apache:cxf:fediz:idp:realm-A
    • Client protocol: saml
    • Client SAML Endpoint: https://localhost:8443/fediz-idp/federation
    Once the client is created you will see more configuration options:
    • Select "Sign Assertions"
    • Select "Force Name ID Format".
    • Valid Redirect URIs: https://localhost:8443/*
    Now go to the "SAML Keys" tab of the newly created client. Here we will have to import the certificate of the Fediz IdP so that Keycloak can validate the signed SAML requests. Click "Import" and specify:
    • Archive Format: JKS
    • Key Alias: realma
    • Store password: storepass
    • Import file: stsrealm_a.jks
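
    To verify that the keystore really contains the "realma" key under those credentials, you can list it with keytool first (assuming keytool is on your path):

    keytool -list -v -keystore stsrealm_a.jks -storepass storepass -alias realma
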
    1.3) Export the Keycloak signing certificate

    Finally, we need to export the Keycloak signing certificate so that the Fediz IdP can validate the signed SAML Response from Keycloak. Select "Realm Settings" (for "realmb") and click on the "Keys" tab. Copy and save the value specified in the "Certificate" textfield.

    2) Install and configure the Apache CXF Fediz IdP and sample Webapp

    Follow a previous tutorial to deploy the latest Fediz IdP + STS to Apache Tomcat, as well as the "simpleWebapp". Test that the "simpleWebapp" is working correctly by navigating to the following URL (selecting "realm A" at the IdP, and authenticating as "alice/ecila"):
    • https://localhost:8443/fedizhelloworld/secure/fedservlet
    2.1) Configure the Fediz IdP to communicate with Keycloak

    Now we will configure the Fediz IdP to authenticate the user in "realm B" by using the SAML SSO protocol. Edit 'webapps/fediz-idp/WEB-INF/classes/entities-realma.xml'. In the 'idp-realmA' bean:
    • Change the port in "idpUrl" to "8443". 
    In the 'trusted-idp-realmB' bean:
    • Change the "url" value to "http://localhost:8080/auth/realms/realmb/protocol/saml".
    • Change the "protocol" value to "urn:oasis:names:tc:SAML:2.0:profiles:SSO:browser".
    • Change the "certificate" value to "keycloak.cert".
    2.2) Configure Fediz to use the Keycloak signing certificate

    Copy 'webapps/fediz-idp/WEB-INF/classes/realmb.cert' to a new file called 'webapps/fediz-idp/WEB-INF/classes/keycloak.cert'. Edit this file, delete the content between the "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----" tags, and paste in the Keycloak signing certificate retrieved in step 1.3 above.
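
    One way to create that file from a shell, pasting the copied value in place of the placeholder (just a sketch, any text editor does the same job):

    cat > webapps/fediz-idp/WEB-INF/classes/keycloak.cert << 'EOF'
    -----BEGIN CERTIFICATE-----
    <paste the Base64 certificate value copied from the Keycloak "Keys" tab here>
    -----END CERTIFICATE-----
    EOF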

    The STS also needs to trust the Keycloak signing certificate. Copy keycloak.cert into 'webapps/fediz-idp-sts/WEB-INF/classes'. In this directory, import keycloak.cert into the STS truststore via:
    • keytool -keystore ststrust.jks -import -file keycloak.cert -storepass storepass -alias keycloak
    Restart Fediz to pick up the changes (you may need to remove the persistent storage first).

    3) Testing the service

    To test the service navigate to:
    • https://localhost:8443/fedizhelloworld/secure/fedservlet
    Select "realm B". You should be redirected to the Keycloak authentication page. Enter the user credentials you have created. You will be redirected to Fediz, where it converts the received SAML token to a token in the realm of Fediz (realm A) and redirects to the web application.

    0 0

    sharkbait posted a photo:

    The Shrine

    The Bowie mural in Brixton
    Still popular, still growing
    Very touching

    0 0
  • 02/03/16--05:09: Ian Boston: Ai in FM
    Limited experience in either of these fields (AI and financial markets) does not stop thought or research. At the risk of being corrected, from which I will learn, I'll share those thoughts.

    Early AI in FM was broadly expert systems, used to advise on hedging to minimise overnight risk, etc., or to identify certain trends based on historical information. Like the early symbolic maths programs (1980s) that revolutionised the way in which theoretical problems can be solved (transformed) without error in a fraction of the time, early AI in FM put an expert with a probability of correctness on every desk. This is not the AI I am interested in. It is only artificial in the sense that it artificially encapsulates the knowledge of an expert. The intelligence is not artificially generated or acquired.

    Machine learning covers many techniques. Supervised learning takes a set of inputs and allows the system to perform actions based on a set of policies to produce an output. Reinforcement learning favors the more successful policies by reinforcing the action. Good machine, bad machine. The assumption is that the environment is stochastic, or unpredictable due to the influence of randomness.

    Inputs and outputs are simple. They are a replay of the historical prices. There is no guarantee that future prices will behave in the same way as historical prices, but that is in the nature of a stochastic system. Reward is simple: profit or loss. What is not simple is the machine learning policies. AFAICT, machine learning, for a stochastic system with a large amount of randomness, can't magic the policies out of thin air. Speech has rules, image processing too, and although there is randomness, policies can be defined. At the purest level, excluding wrappers, financial markets are driven by the millions of human brains attempting to make a profit out of buying and selling the same thing without adding any value to that same thing. They are driven by emotion, fear and every aspect of human nature rationalised by economics, risk, a desire to exploit every new opportunity, and a desire to be a part of the crowd. Dominating means trading on infinitesimal margins, exploiting perfect arbitrage as the randomness exposes differences. That doesn't mean the smaller trader can't make money, as the smaller trader does not need to dominate, but it does mean the larger the trader becomes, the more extreme the trades have to become to maintain the level of expected profits. I said excluding wrappers because they do add value: they adjust the risk, for which the buyer pays a premium over the core assets. That premium allows the inventor of the wrapper to make a service profit in the belief that they can mitigate the risk. It is, when carefully chosen, a fair trade.

    The key to machine learning is to find a successful set of policies: a model for success, or a model for the game. The game of Go has a simple model, the rules of the game. Therefore it's possible to have a policy of "do everything". Go is a very large but ultimately bounded Markov Decision Process (MDP). Try every move. By trying every move, every theoretical policy can be tested. With feedback and iteration, input patterns can be recognised and successful outcomes can be found. The number of combinations is very large, but finite: so large that classical methods are not feasible, but not infinite, so reinforcement machine learning becomes viable.

    The MDP governing financial markets may be near infinite in size. While attempts to formalise it will appear to be successful, the events of 2007 have shown us that if we believe we have found the finite boundaries of an MDP representing trade, +1 means we have not. Just as finite+1 is no longer finite by the original definition, infinite+1 proves what we thought was infinite is not. The nasty surprise just over the horizon.

    0 0

    My Dad's 70th birthday was January 16, 2015. Years ago, we started talking about celebrating this event and suggested we go somewhere warm. Trish and I proposed Hawaii, Mexico, or even Cuba. My Dad had his own idea: he wanted to go to Jekyll Island, Georgia. He spent a few high school years in Brunswick, GA and had fond memories of the place and weather. He also wanted to drive there, because road trips are awesome. At least they are in my family. We've done many family road trips over the years; the last one was in July.

    One of the great joys in owning a VW Vanagon Westfalia is having the ability to sleep anywhere. Home is where your van is. We did a bunch of van repairs and upgrades last fall: rebuilt the transmission, added modern headlights, installed a Truck Fridge and a Propex heater. It was finally in tip-top shape for winter camping (or a road trip), so I suggested to my parents that we drive it to Georgia, via New Orleans. They agreed and we all smiled with thoughts of visiting the Big Easy. We'd never been.

    Sunrise on the first day of Raible Road Trip #70 My parents flew to Denver a few days before we started. Our journey began early Sunday morning, January 10th. We left Denver around 6am and I spent the first couple hours driving in the dark, eastbound on I-70. Within the first hundred miles, the van's odometer quit working, so we had to rely on the gas gauge to know when to fill up. For anyone that's owned a vanagon, you'll know their gas gauges are fairly unreliable. Tom Hanks even talked about this on the David Letterman show.

    My mom took the second shift and when she asked me if we had enough gas to make it another 50 miles, I said "yep, it sure looks like it". 20 minutes later, we were on the side of I-70, out of gas. We grabbed the empty jerry can off the back and started walking to the next exit, which wasn't far, but it was cold. Within a couple minutes, a trucker stopped to give us a ride. I hopped in and my mom walked back to the van. The trucker, Don, was super cool and drove me to a nearby gas station and back. It was the first time I'd ridden in a semi truck and it turned out to be a pretty cool experience. We decided to stop for gas whenever the fuel gauge hit the orange zone from then on.

    Filling up the gas tank since we ran out of gas on the first tank!
    Riding in the back seat of a Westy through Kansas

    We spent the first night with my Aunt Donna and Uncle Dale, in Clever, Missouri. We had a lot of fun visiting with them and talked for hours, both before bed and the next morning.

    Aunt Donna, Mom, Dad and Uncle Dale - awesome people!

    I spent that night in the van and tried out the new Reflectix insulation my dad and I made the day before. It was 12°F outside that night and plenty warm inside the van.

    The next day, we drove a few hours to Hot Springs, Arkansas. We marveled at the bathhouses from a bygone era and stayed in a Hot Springs National Park KOA.

    Cards in the van
    Our campsite in Hot Springs, AR. 21°F that night.

    Tuesday, we drove south, across the mighty Mississippi to New Orleans. We arrived at Ponchartrain Landing, just after dark and enjoyed dinner at the RV Park's restaurant. We were impressed with the food and the live music. We caught an Uber into the French Quarter and enjoyed live music from the Treme Brass Band at d.b.a..

    We made it to New Orleans!
    The land of great music! Thanks for the recommendation Keith!
    Our Setup

    We spent the next day touring the French Quarter, learning about its history and basking in the soulful atmosphere.

    Happy Couple
    My parents and their colorful shirts

    A good depiction of the night

    Thursday, we continued our journey, driving to Panama City, Florida and stopping to see the USS Alabama along the way.

    USS Alabama

    When we started driving on Friday, the Syncro had a loud knocking in the left rear. I posted this to the Vanagon Owners group on Facebook and they quickly diagnosed it as a bad CV joint. The noise went away at highway speeds.

    We made it to Jekyll Island by late afternoon and I took a quick nap before picking up Trish and the kids from the Jacksonville airport around midnight.

    My dad's brother, Jim, and sister, Mary, arrived in Jekyll Island before we did. Jim's wife Maryann shares a birthday with my dad, so we celebrated on Saturday with cards, laughter and lots of smiles. We rented bikes from a shop nearby and had a blast biking around the island and walking on the beaches.

    We loved the bike rides
    And selfies
    Photo bomb!

    One of the highlights of the weekend was doing Jack's science experiment. He'd brought a kit to grow bacteria in Petri dishes so we rode our bikes around the island and stopped at various restaurants. We asked kitchen managers to swab their countertops for us, then we captured the result in different dishes. It was a warm sunny day and all the restaurants were very helpful. We completed our mission in time to watch the Broncos beat the Steelers that evening.

    Happy Broncos Family

    I returned Trish and the kids to the airport on Monday and they flew back to Denver. I worked remotely for the rest of the week while my parents, aunts and uncle spent long hours together exploring the island. They talked about memories of their youth, took a lot of walks together, did some shopping and did what older people do: watched The Weather Channel. A historic snowstorm was headed for the east coast and it could alter flights home to New York, as well as our route back to Denver.

    Last sunset in Jekyll Island

    On the road again! Heading to the Broncos tailgate. Left Friday morning at 6am, hoping to make it to Denver for the Broncos tailgate on Sunday.

    Our trip back to Denver began early Friday morning, January 22nd. I was eager to get home for the AFC Championship game on Sunday. We dropped my Aunt Mary off at the airport in Jacksonville around 7am and drove through Florida, staying south on I-10 to avoid the storm. There was lots of rain and wind, but only for a few hours. We made it to Shreveport, Louisiana that evening and stayed at a KOA. We made it 900 miles the first day. With 960 miles left, we knew Saturday would be rough, but we could make it back in time for the game!

    Saturday was sunny in Texas and we spent most of the day driving through it.

    Texas
    Syncro Sunset

    The knocking noise from the CV came back, but we knew there was no stopping now. We pulled into Denver around 11pm that night, finishing our 4000 mile road trip to Jekyll Island and back.

    What a fun trip! It certainly helped that we had no breakdowns. Yes, we lost a hubcap, the CV needed to be replaced and the odometer stopped working. However, these were all easy to fix and the odometer even started working again on the way home. We made it back in time for the Broncos game and Trish and I had a blast at the tailgate before watching the Broncos beat the Patriots.

    We made it to Denver at 11pm Saturday. Broncos tailgate on Sunday!

    One other highlight of the trip was that The Bus was this close to being finished. The shop working on it was sending me pictures the whole time and set the expectation that it would be finished within days. Unfortunately, they discovered some issues while driving it (dead battery, leaks in the gas tank), so they determined there was still work to do. Nevertheless, they invited us down for a test drive. My parents and I drove down Monday afternoon, hopped in it for a ride and quickly broke down within a couple miles. The battery was charged enough to start it, but when the engine died at a light, there was no juice left. We pushed it off to the side of the road, called for help to get it jump-started and drove it back to the shop. With all the snow on the ground in Denver, I'm OK if it takes a few more weeks to complete.

    First test drive. Had to push it off the road and get a jump; dead battery.

    More on Flickr → Raible Road Trip #70 Album

    I also named both our VWs on this trip. The '66 21-Window will now be known as "Hefe", and we're calling the '90 Syncro Westy "Stout". Hefe can be attributed to its German roots and one of my favorite beer styles, Hefeweizen. In Spanish, El Jefe means "the boss", so that's fitting too. As for Stout, both Trish and I have Irish roots, we love stout beers and we've turned it into a thick and strong vehicle.

    Now it's February. It's been dumping snow in the mountains and we completed our first winter camping trip with the kids last weekend. The Broncos are in the Super Bowl this weekend. Life couldn't get much better right now. ;)

    0 0

    This new pre-release introduces two backwards incompatible changes over 2.0.0-alpha-03:

    • the mapping of DifferenceEngine#setNamespaceContext has been inverted from prefix -> URI to URI -> prefix in order to be consistent with the same concept in XPathEngine.
    • the SchemaURI property of Validator has been removed in XMLUnit.NET and pushed to ParsingValidator in XMLUnit for Java.

    Additional changes:

    • CommentLessSource uses an XSLT stylesheet internally which lacked the required version attribute in XMLUnit for Java. PR #47 by @phbenisc.
    • Comparison now also contains the XPath of the parent of the compared nodes or attributes which is most useful in cases of missing nodes/attributes because the XPath on one side is null in these cases. Issue #48 implemented via PR #50 by @eguib.

    This is still an alpha release as the API may well change based on your feedback. Please provide feedback about the API in case it needs to get adapted before the final release on the xmlunit-general list or via GitHub issues.

    XMLUnit 2.0.0-alpha-04 is available as a GitHub release for both XMLUnit for Java and XMLUnit.NET.

    XMLUnit for Java is also available via Maven Central, with additional artifacts xmlunit-matchers and xmlunit-legacy.

    XMLUnit.NET is also available as nuget packages XMLUnit.Core, XMLUnit.NUnit2.Constraints and XMLUnit.NUnit3.Constraints.

    For more information please visit the XMLUnit website.

    0 0

    That error message from qemu is telling you that you need to compile it with gnutls support enabled. On Gentoo, set the gnutls USE flag and rebuild.
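
    On Gentoo that typically amounts to something like the following, run as root (a sketch; the package.use file name is arbitrary):

    echo "app-emulation/qemu gnutls" >> /etc/portage/package.use/qemu
    emerge --oneshot app-emulation/qemu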

    0 0

    I really don't blog often these days. It's been busy, for sure, but in the best way possible. I also find myself using short-form sites, such as Twitter, more often. On December 10th, Lisa and I had a baby :) Ella is an amazing, healthy, fast-growing baby girl. I'll spare everyone the new parent talk. We're lucky, and we appreciate it. I took about three weeks completely off to spend at

    0 0
  • 02/07/16--06:16: David Reid: React Mixins
  • Having written a simple test app I’m continuing to use it to try and develop my React knowledge :-) One aspect that I did find a little annoying was the amount of code I seemed to repeat. I kept thinking that I should have a base class and simply inherit from it – a pattern I have used a lot in other languages, but this is React and nothing I had seen suggested that pattern.

    Enter the Mixin

    A React mixin is a simple idea: a block of code that is common to one or more components. After looking at them I found it was possible to extract a lot of common functionality, resulting in this code.

    var AppointmentMixin = {
      componentDidMount: function() {
        // assumes the parent passes a 'name' prop identifying this appointment
        this.props.updateInterval(this.props.name, this.totalTime());
      },
      setAdditional: function(ev) {
        var additional = parseInt(ev.target.value, 10) || 0;
        this.setState({additional: additional});
        this.props.updateInterval(this.props.name, this.state.duration + additional);
      },
      totalTime: function() {
        return this.state.duration + this.state.additional;
      },
      finishTime: function() {
        return this.props.start.clone().add(this.totalTime(), "minutes");
      }
    };

    There is nothing in the above code that is specific to a component; it's all plain generic code. To use it in a component you need to add a 'mixins' line and remove the old code. This now gives me a component that looks like this.

    var Hair = React.createClass({
      mixins: [AppointmentMixin],
      getInitialState: function() {
        return {duration: 90, additional: 0};
      },
      render: function() {
        return (
          <div>
            <h3>Hair Appointment</h3>
            <p>Start: {this.props.start.format("HH:mm")}</p>
            <p>Duration: {this.state.duration} minutes</p>
            <p>Additional duration: <input type="number" step="1" ref="additional"
                                           onChange={this.setAdditional}/> minutes</p>
            <p>Total Time Required: {this.totalTime()} minutes</p>
            <p>Finish: {this.finishTime().format("HH:mm")}</p>
          </div>
        );
      }
    });

    This is much closer to what I wanted.

    Uh oh…

    While looking around for information on mixins I came across this line repeatedly.

    Unfortunately, we will not launch any mixin support for ES6 classes in React. That would defeat the purpose of only using idiomatic JavaScript concepts.

    This looked as if support would be coming, but then I found this post and also this.

    Higher Order Component – huh?

    Looking at some posts about the concept helped me get a better understanding of what it's trying to do, so I decided to try to change my example to use it. Sadly it didn't work out, as I've been unable to get the higher order component solution working in a manner akin to a mixin. It's not so much a replacement as a totally different approach that requires things to be done differently.

    However, always keen to learn, I rewrote things and ended up with this.

    function TimedAppointment(duration, title) {
      const Appointment = React.createClass({
        getInitialState: function() {
          return {duration: duration,
                  additional: 0,
                  title: title};
        },
        componentDidMount() {
          // assumes the parent passes a 'name' prop identifying this appointment
          this.props.updateInterval(this.props.name, this.totalTime());
        },
        setAdditional(ev) {
          var additional = parseInt(ev.target.value, 10) || 0;
          this.setState({additional: additional});
          this.props.updateInterval(this.props.name, this.state.duration + additional);
        },
        totalTime() {
          return this.state.duration + this.state.additional;
        },
        finishTime() {
          return this.props.start.clone().add(this.totalTime(), "minutes");
        },
        render() {
          return (
            <div>
              <h3>{this.state.title}</h3>
              <p>Start: {this.props.start.format("HH:mm")}</p>
              <p>Duration: {this.state.duration} minutes</p>
              <p>Additional duration: <input type="number" step="1" ref="additional"
                                             onChange={this.setAdditional}/> minutes</p>
              <p>Total Time Required: {this.totalTime()} minutes</p>
              <p>Finish: {this.finishTime().format("HH:mm")}</p>
            </div>
          );
        }
      });

      return Appointment;
    }

    var Hair = TimedAppointment(90, "Hair Appointment");
    var Nails = TimedAppointment(30, "Manicure");

    This is much neater and certainly gives me a single set of reusable code – no mixins required. It’s possibly far closer to where it should be and is still within my limited comfort zone of understandability.

    If anyone cares to point out where I could have gone or what I could have done differently, please let me know :-)


    As time goes on I’m sure that the newer formats of javascript will become more of a requirement and so mixins probably won’t be much use going forward.

    0 0

    After using webpack for a few days, the attraction of switching to the dev server is obvious.

    The webpack-dev-server is a little node.js Express server, which uses the webpack-dev-middleware to serve a webpack bundle.


    Oddly enough, it needs to be installed via npm! However, as we're going to run it from the command line, we'll install it globally.

    sudo npm install -g webpack-dev-server


    After installing, simply running the server (in the same directory as the webpack.config.js file) shows that it's working and that the bundle is built and made available. The next step is to get it serving the HTML file we've been using. This proves to be as simple as

    $ webpack-dev-server --content-base html/

    Requesting the page from the dev server gives the expected response. Removing the bundled files produced by webpack directly from the html directory and refreshing the page proves the files are being loaded from the dev server. Nice.

    Hot Loading

    Of course, having the bundle served by webpack is only the start – next I want any changes I make to my React code to be reflected straight away – hot loading! This is possible, but requires another module to be loaded.

    npm install --save-dev react-hot-loader

    The next steps are to tell webpack where things should be served, which means adding a couple of lines to our entry in webpack.config.js.

      entry: [
        path.resolve(__dirname, 'components/App.js'),
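
    For reference, the lines that typically get added are the dev-server client and the hot-loading runtime, something like this (a sketch based on a standard webpack 1.x setup; the host/port and the 'only-dev-server' variant are assumptions):

      entry: [
        'webpack-dev-server/client?http://localhost:8080', // dev-server client; host/port assumed
        'webpack/hot/only-dev-server',                      // hot module replacement runtime
        path.resolve(__dirname, 'components/App.js')
      ],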

    As I’m planning on running this from the command line I’m not going to add a plugin line as some sites advise, but rather use the ‘–hot’ command line switch. I may change in future, but at present this seems like a better plan.

    The final step needed is to add the ‘react-hot’ loader, but this is where things hit a big snag. The existing entry for js(x) files looked like this.

          {
            test: /components\/.+.jsx?$/,
            exclude: /node_modules/,
            loader: 'babel-loader',
            query: {
              presets: ['react', 'es2015']
            }
          }

    Adding the loader seemed simple (remembering to change loader to loaders as there was more than one!).

          {
            test: /components\/.+.jsx?$/,
            exclude: /node_modules/,
            loaders: ['react-hot', 'babel-loader'],
            query: {
              presets: ['react', 'es2015']
            }
          }

    Error: Cannot define 'query' and multiple loaders in loaders list

    Whoops. The solution came from reading various posts, and eventually I settled on this. It works for my current versions of Babel but may not work for future ones. All the changes below are applied to the webpack.config.js file.

    Add the presets as a variable before the module.exports line.

    var babelPresets = {presets: ['react', 'es2015']};

    Change the loader definition to use the new variable and remove the existing definition.

          {
            test: /components\/.+.jsx?$/,
            exclude: /node_modules/,
            loaders: ['react-hot', 'babel-loader?'+JSON.stringify(babelPresets)]
          }

    Now, when running webpack-dev-server --content-base html/ --hot, everything is fine and the page is served as expected.

    Editing one of the components triggers a rebuild of the bundle when saved – exactly as expected.

    All Change!

    As I tried to get this working I discovered that react-hot-loader is being deprecated. Until that happens I'm happy with what I have, and the author promises a migration guide.


    To keep things simpler and avoid the inevitable memory lapses and head-scratching about a lack of hot reloading, I've added a line to the package.json file. With this added I can now simply type npm run dev and the expected things will happen.

    "scripts": {
        "test": "echo \"Error: no test specified\"&& exit 1",
        "build": "webpack --progress",
        "dev": "webpack-dev-server --content-base html/ --hot"

    0 0

    Is there an easy way to add simple generated data tables from CSV or the like using the Apache CMS system for the website? I.e. I want to check in a CSV (or other simple table of data) that certain committers can edit via a spreadsheet, and then display selected rows from that table on a webpage in some semi-pretty manner.

    Did you know that the ASF has its own CMS / static generator / magic update system that runs the homepage and many Apache project homepages? While it's more of an Apache infra tool than a full Apache top-level project, it's still a full-service solution for allowing multiple static website builds that are integrated into our servers.

    While there are plenty of great technical CMS systems, when choosing a system for your company, many of the questions are organizational and deployment related. How easy is it for your IT team to manage the core system? How easy is it for various teams (or projects) to store and update their own content, perhaps using different templates in the system? How can you support anonymous editing/patch submission from non-committers? Does it support a safe and processor-respectful static workflow, minimizing the load on production servers while maximizing backups? And how can you do all this with a permissive license, and only hosting your own work?

    The Apache CMS – while a bit crufty – supports all these things (although the infra peeps might argue about the maintenance part!). Everything's stored in SVN, so restoring a backup or bringing the production server back is just checking the tree out again. Many projects use a Markdown variant, although some projects plug in their own static generator tools. The web GUI, while sparse, does have a great tutorial for submitting anonymous patches to Apache websites.

    My question is: what's the simplest way to have a top-level webpage pull in some sort of simple data source? In particular, I don't want to have to maintain much code, and I only want to add this data table bit within an existing page, without having to run my own whole generation script.

    The first specific use case is displaying /foundation/marks/list/registered, a normal a.o page that will display a data table of all the registered trademarks the ASF owns. I'll check in a CSV that I get from our counsel that includes all the legal details of our trademarks.
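
    As a rough illustration of the scale of thing intended – a minimal sketch with made-up file names and columns, not a proposed implementation – a tiny Node script could turn such a CSV into a Markdown table fragment for the CMS to include:

    // csv2table.js – hypothetical sketch: turn a checked-in CSV into a Markdown table fragment.
    // File names and columns are invented; a real version would use a proper CSV parser
    // and emit only the selected rows/columns.
    const fs = require('fs');

    const rows = fs.readFileSync('registered-marks.csv', 'utf8')
      .trim()
      .split('\n')
      .map(function (line) { return line.split(','); }); // naive split; real CSVs need quoting support

    const header = rows[0];
    const data = rows.slice(1);

    const out = [
      '| ' + header.join(' | ') + ' |',
      '| ' + header.map(function () { return '---'; }).join(' | ') + ' |'
    ].concat(data.map(function (r) { return '| ' + r.join(' | ') + ' |'; }));

    fs.writeFileSync('registered-marks.md', out.join('\n') + '\n');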

    Bonus points for a simple system that:

    • Can pull some columns from a separate table: namely, projects.a.o descriptions from the projects.
    • Can pull my CSV listing trademark numbers from a private repo (committers or foundation).
    • Uses Python or JS and not Perl.

    Note: I have cut back my $dayjob recently, so I will actually have time to write some of the code for this work myself now – finally!

    0 0

    Website Brand Review of Apache Mesos

    How do open source projects get popular? By providing some useful functionality that users want to have. How do open source projects thrive over the long term? By turning those users into contributors who then help improve and maintain the project. How well a project showcases itself on the web is an important part of the adoption and growth cycle.

    Here's my quick review of the Apache Mesos project, told purely from the point of view of a new user finding the project website. Mesos is turning into a major project in the big data and cloud space; perhaps not as obviously popular as Apache Spark yet, but certainly big.

    What Is Apache Mesos?

    Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

    In other words, Mesos is built like the Linux kernel in terms of managing resources for user programs, but at a much higher level of abstraction. Mesos provides APIs for resource management and scheduling across both datacenters and the cloud.

    No, Really, What Is Apache Mesos For?

    Mesos allows you to run a variety of applications across the various machines of a datacenter or in the cloud, at small or very large scales. That is, Mesos is the manager that orchestrates allocating CPUs, storage, memory, etc. to the various component applications that you want to actually run. In many ways it’s like an operating system for the cloud. Mesos doesn’t do your big data work, but it can ensure your processing runs can be reliably managed across the datacenter or cloud instances that are doing the work.

    Mesos provides master nodes for managing a cluster, with slave nodes on all resources. It then controls deploying your applications (which can be in Docker or other containers) – things like Hadoop, Spark, or ElasticSearch jobs – onto the various slaves. In particular it ensures each running task gets the resources it requires and manages deployment and execution.

    Mesos supports a solid variety of apps and containers, and allows for fairly fine-grained allocation, both with a default model, as well as allowing you to program your own allocation models and priorities. You can run multiple types of frameworks supporting different types of jobs on the same Mesos cluster.

    New User Website Perceptions

    That is, what does a new user see “above the fold” when first coming to the Apache Mesos project homepage? For their first impression, is it easy to find things, and is the design appealing and easy to follow?

    The homepage is simple in design and text, but has just a basic description of functionality. Key links for Getting Started, Docs, and an obvious Download are right at the top. A prominent News section features releases, and Follow Us points to the official Twitter feed. There aren't any code examples or many specific details of what the Mesos architecture looks like on the homepage.

    Getting Started has specific steps, but mostly for download and install; that is, actual examples of use are pointed to elsewhere on subpages. Documentation provides a laundry list of links to procedures, but little overview material. While the information is clear, it often assumes a high degree of technical expertise from users. Once someone reads through the docs it’s clear what Mesos can do, but users not experienced with cluster management or scheduler software will have a hard time seeing just what it is.

    Community pages and the how-to contribute guides are straightforward, and include nicely detailed steps for submitting patches or finding mailing lists.

    UI design is simple but integrates well with the Mesos logo, and is consistent across all major subpages, with simple but useful context navigation sidebars.

    Apache Branding Requirements

    Apache projects are expected to manage their own affairs, including making all technical and content decisions for their code and websites. However to ensure a small modicum of consistency – and to ensure users know that an Apache project is hosted at the ASF – there are a few requirements all Apache projects must include in their websites (or wikis, etc.)

    • The "Apache Mesos" name is used in the header and a few places on the homepage, but is not used consistently on other pages of the website.
    • Website navigation links to ASF pages included in sitewide header navigation.
    • Trademark attributions and link included in footers; logo includes TM.
    • DOAP file exists.
    • Powered By Mesos page(s) include simple lists of major users and products using/integrating with Mesos.

    SEO / Search Hits / Related Sites

    Well, SEO is far outside of our scope (and debatable in usefulness anyway), but it's interesting to see: how does a new user find the Apache Mesos homepage when searching?

    Searching for “mesos”:

    Top hit: homepage
    Second hit: wikipedia
    Other hits: variety of Mesos related sites, information

    Searching for “mesos software”:

    A wide variety of how to, what it is, and technical pages about Mesos.

    Other major uses of our Mesos brand in domain names:

    Mesosphere: a commercial company that uses Apache Mesos Testing infrastructure for Mesos deployments
    MesosCon: annual conference about the project (also has @mesoscon)

    There is a “Mesosphere DCOS” datacenter operating system software product, which is built on top of Apache Mesos software. Please note that this is an unusual case, and should not be used as an example for any other companies or software products.

    Social Media Presence

    Many open source projects have a social media presence – although often not as polished or consistent a presence as a consumer or commercial brand presence would have.

    • official, listed on homepage.
    • official, auto-tweeting from Planet Apache.
    • “Apache Mesos Users”, 500+ members.
    • has some traffic.
    • is active.

    What Do You Think Apache Mesos Is?

    So, what do you think? Is Mesos going to be the next big thing that should manage all your Dockers, Hadoops, and S3 buckets? Or are you focusing on Kubernetes, OpenStack/CloudStack, or some specific vendor’s magic offering?

    Note: I’m writing here as an individual, not wearing any Apache hat. I hope this is useful both to new users and to the Apache Mesos community, not necessarily a call to change anything. I haven’t used Mesos for any real deployments myself, so please do comment with corrections to anything I’ve messed up above!

    0 0
  • 02/10/16--08:18: David Reid: sass
    Continuing my delve into React, webpack and so on, and after adding a bunch of css files, I decided it was time to join the 21st century and switch to one of the css preprocessors. LESS was all the rage a few years ago, but now sass seems to have the mindshare, so I'm going to head down that route.


    Oddly enough, it installs via npm :-)

    npm install --save-dev node-sass
    npm install --save-dev sass-loader

    Webpack Config

    The webpack config I’m using is as detailed in my post More Webpack, so the various examples I found online for adding sass support didn’t work as I was already using the ExtractTextPlugin to pull all my css into a single file. The solution turned out to be relatively simple and looks like this.

          {
            test: /\.scss$/,
            loader: ExtractTextPlugin.extract(['css', 'sass'])
          }

    Additionally I need to add the .scss extension to the list of those that can be resolved, so another wee tweak.

      resolve: {
        extensions: ['', '.js', '.jsx', '.css', '.scss']
      }


    One reason for moving to SASS is to allow me to split the huge css files into more manageable chunks, but how to arrange this? Many posts on the matter have pointed me to SMACSS and I’m going to read through the free ebook (easily found via a web search) to see what inspiration I can glean, but I think for each React component I’d like to keep the styles alongside as the bond between the JSX and the styling is very tight and changing one will probably require thinking about further changes. As per previous experiments, the component can then require the file and it will magically appear in the bundled, generated css file, regardless of whether I’ve written it in sass or plain css.

    For the "alongside" files I'll use the same filename with the leading underscore that tells sass not to output the file directly. With the webpack setup that isn't a concern now, but getting into the habit is likely a good idea for the future :-) This means for a component in a file named App.js I'll add _App.scss and add a line require('_App.scss'); after the rest of the requires.
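
    The top of such a component then looks something like this (a sketch):

    // components/App.js (sketch)
    var React = require('react');
    require('_App.scss'); // picked up by the sass/ExtractText pipeline and emitted into the bundled css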


    I want to use a central variables file for the project, which I can then reference in the sass files, but haven’t quite decided where it should live just yet. Hopefully after reading the ebook and looking at the project a bit more it will make sense.

    Now that sass handling is in place, it's time to start pulling apart my monolithic plain css file and creating the smaller sass files.

    0 0

    LXC aka Linux Containers are a convenient way to run a lightweight virtual machine. LXC provides a complete operating system with access to devices attached to the host machine. Let us see how we can access an Android device from an LXC instance via adb or fastboot. I assume you have a working LXC with networking set up properly. I am using an LXC named 'test-lxc', which is a Debian sid based container (root@test-lxc:/#), and a Google Nexus 4 as the Android device with debug mode enabled. My host machine (stylesen@harshu:~$) is a Debian sid based Thinkpad.

    When I plug in the USB cable from the Android device to my host machine I can see the following in the lsusb output:

    stylesen@harshu:~$ lsusb
    Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
    Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 001 Device 007: ID 04f2:b217 Chicony Electronics Co., Ltd Lenovo Integrated Camera (0.3MP)
    Bus 001 Device 005: ID 147e:2016 Upek Biometric Touchchip/Touchstrip Fingerprint Sensor
    Bus 001 Device 021: ID 18d1:4ee0 Google Inc.
    Bus 001 Device 008: ID 0835:1601 Action Star Enterprise Co., Ltd
    Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
    Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

    From the above we can see my Nexus 4 (Google Inc.) is connected on USB bus 001 as device 021. The actual path of the Nexus 4 device translates to the following:

    /dev/bus/usb/001/021
    Within my LXC, the Nexus 4 appears in the lsusb output as follows, but adb and fastboot do not yet have access to the device:

    root@test-lxc:/# lsusb
    Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
    Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 001 Device 007: ID 04f2:b217 Chicony Electronics Co., Ltd Lenovo Integrated Camera (0.3MP)
    Bus 001 Device 005: ID 147e:2016 Upek Biometric Touchchip/Touchstrip Fingerprint Sensor
    Bus 001 Device 021: ID 18d1:4ee0 Google Inc.
    Bus 001 Device 008: ID 0835:1601 Action Star Enterprise Co., Ltd
    Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
    Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

    Neither fastboot nor adb can see the device, as shown below:

    root@test-lxc:/# fastboot devices
    root@test-lxc:/# adb devices
    List of devices attached


    In order to make this device accessible from within the container, use the following command on the host machine:

    stylesen@harshu:~$ sudo lxc-device -n test-lxc add /dev/bus/usb/001/021

    Once the above command is run, we can access the Nexus 4 via fastboot or adb as follows:

    root@test-lxc:/# fastboot devices
    04f228d1d9c76f39    fastboot
    root@test-lxc:/# fastboot reboot

    finished. total time: 3.011s

    Every time the Nexus 4 is disconnected from the USB port and reconnected (which also happens on a 'reboot' or 'reboot-bootloader'), the device number within the USB bus changes, though the bus number remains the same. For example, after each reboot or disconnection the device path becomes something like the following:

    after reboot:  /dev/bus/usb/001/022

    after reboot:  /dev/bus/usb/001/023


    after reboot: /dev/bus/usb/001/0NN

    This makes it difficult to automate things: every time, you must check the output of lsusb to identify the device number and add it to the container with the lxc-device command. To make things simple and deterministic, I have the following udev rule in /etc/udev/rules.d/51-android.rules:

    SUBSYSTEM=="usb", ATTR{idVendor}=="18d1", ATTR{idProduct}=="4ee2", ATTRS{serial}=="04f228d1d9c76f39", MODE="0666", GROUP="plugdev", SYMLINK+="android-nexus4"

    Note the ATTRS{serial} and SYMLINK+="android-nexus4" entries, which help us identify the Nexus 4 and create a symlink to it consistently, without worrying about the USB device number on the bus. With the above in place we get a device as follows:

    stylesen@harshu:~$ ls -alh /dev/android-nexus4
    lrwxrwxrwx 1 root root 15 Feb 11 11:36 /dev/android-nexus4 -> bus/usb/001/022

    Now it is simple to add the Android device to the container with the following command:

    stylesen@harshu:~$ sudo lxc-device -n test-lxc add $(sudo readlink -f /dev/android-nexus4)
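
    That one-liner is easy to wrap up so it can be re-run after every reboot of the device (a sketch; the script name is arbitrary):

    #!/bin/sh
    # add-nexus4.sh - resolve the udev symlink to the current bus/device path
    # and hand the device through to the container
    DEVICE=$(readlink -f /dev/android-nexus4)
    sudo lxc-device -n test-lxc add "$DEVICE"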

    Within the container we can access the Nexus 4 via adb as follows:

    root@test-lxc:/# adb devices
    List of devices attached
    04f228d1d9c76f39    device

    NOTE1: lsusb is a command which is available via the usbutils package in Debian.

    0 0

    Having started moving to sass for my project and including the required bits in my webpack configuration (blog post), the next issue I ran into was that importing didn’t seem to work as expected.

    Require not Import?

    One of the additions I made to my webpack config was to add a resolve section, allowing me to use more convenient and simpler require lines in my javascript.

      resolve: {
        modulesDirectories: ['node_modules', 'components', 'css', 'fonts'],
        extensions: ['', '.js', '.jsx', '.css', '.scss']
      }

    This worked exactly as expected wherever I used a require statement, so I had expected that this would transfer to import statements in css and sass files – but it didn’t. As it seems such an obvious thing to do, I had another look at the README for the sass-loader and found what I was looking for.

    ~, but not as you know it

    For my testing I had created as simple a file as I could think of, test.scss.

    @import ('../components/_component.scss')

    This very simple file just imports another file (which happens to be sass) that belongs to a component I have in the ‘components’ directory. Nothing special, but why do I need the full import path? This was what I needed to get things working, but after looking at the sass-loader again I realised that using the ‘~’ would use the webpack resolve routines – which is what I was hoping. A small change to the file,

    @import ('~_component.scss')

    resulted in things working as I wanted.

    NB the README cautions against using ~ as you may expect (if you’re a command line groupie) as using ~/ implies the home directory and probably isn’t what you want.

    Multiple Outputs?

    Having decided that I don’t want css to be included in the javascript output, I added the ExtractText plugin which allowed me to bundle all css into a single css file. This is fine, but what if I wanted to have different css bundles? What if I wanted to have different javascript bundles? My current configuration didn’t seem to allow this.

      entry: [
        'webpack-dev-server/client?', // WebpackDevServer host and port
        path.resolve(__dirname, 'components/App.js')
      ],

    Thankfully, webpack has this covered. Instead of having a single entry you can have multiple entries, each of which you can give a name. Additionally, I realised that the entry point doesn't *need* to be a javascript file, as long as it's a file that can be processed. So I changed the entry section to this.

      entry: {
        bundle: [
          'webpack-dev-server/client?', // WebpackDevServer host and port
          path.resolve(__dirname, 'components/App.js')
        ],
        test: [
          path.resolve(__dirname, 'css/app.scss')
        ]
      },

    Running webpack didn’t give me what I expected as I also needed to change the output definition.

      output: {
        path: path.resolve(__dirname, 'html'),
        filename: '[name].js'
      }

    Using the [name] simply replaces the name I used in the entry definition with that text, which offers additional possibilities. With the changes made, running webpack produces


    The test.js file is a little annoying and in an ideal world it wouldn’t be created, but so far I can’t find any way of preventing it from being created.

    To control the output location even more, simply changing the definition is all that's required for simple changes. Updating it to

      entry: {
        'css/test': [
          path.resolve(__dirname, 'css/app.scss')
        ]
      },

    results in the files being created in html/css, ie


    NB when using a path the name needs to be in quotes.

    Using this setup, component css is still included in the bundle.css and the only things appearing in test.css are those that I have specifically included in the entry file, which opens up a lot of possibilities for splitting things up. As I’m using bootstrap for the project one possibility is to use this to output a customised bootstrap file.
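
    That could be as simple as another named entry pointing at a sass file which imports just the bootstrap pieces needed (a sketch; 'css/bootstrap-custom.scss' is a hypothetical file):

      entry: {
        'css/bootstrap-custom': [
          path.resolve(__dirname, 'css/bootstrap-custom.scss')
        ]
      },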

    Hot Reload

    At present hot reloading of css doesn’t seem to be working. I changed my configuration to this

      entry: {
        bundle: [
          'webpack-dev-server/client?', // WebpackDevServer host and port
          path.resolve(__dirname, 'components/App.js')
        ],
        test: [
          path.resolve(__dirname, 'css/app.scss')
        ]
      },

    which still provides hot reloading of the javascript, but the css files don’t seem to work. This seems to be a common issue, but as it’s not a serious one for me at present I’m not going to spend too much time looking for solutions. If anyone knows, then I’d love to hear from you.

    0 0

    Website Brand Review of Apache HBase

    How do open source projects get popular? By providing some useful functionality that users want to have. How do open source projects thrive over the long term? By turning those users into contributors who then help improve and maintain the project. How well a project showcases itself on the web is an important part of the adoption and growth cycle.

    Here’s my quick review of the Apache HBase project, told purely from the point of view of a new user finding the project website. HBase is a key part of the big data storage stack, so although you may not work directly with it, it’s probably underlying some systems you use.

    What Is Apache HBase?

    “Apache HBase™ is the Hadoop database, a distributed, scalable, big data store”.

    In other words, HBase is a non-relational database meant for massively large tables of data that is implicitly distributed across clusters of commodity hardware. HBase provides “linear and modular scalability” and a variety of robust administration and data management features for your tables, all hosted atop Hadoop’s underlying HDFS file system.

    No, Really, What Is Apache HBase For?

    HBase is a solid and basic database or data store for massive amounts of data. As the documentation says: “If you have hundreds of millions or billions of rows, then HBase is a good candidate”. HBase provides a fairly simple set of NoSQL style put/scan/delete commands for your data, so it’s not a rich set of database functionality. But it is integrated tightly with HDFS, Hadoop, and ZooKeeper, and is built to be distributed and scalable by default. Just add new nodes, and you linearly scale both storage and processing power automatically.

    HBase offers a command shell, Java APIs, and REST APIs for managing everything, along with a variety of other integrations with popular big data storage, management, and processing/manipulation packages.
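
    For flavour, that NoSQL-style command set looks like this in the HBase shell (a sketch with made-up table and column names):

    create 'mytable', 'cf'                       # table with one column family
    put 'mytable', 'row1', 'cf:greeting', 'hi'   # store a cell
    scan 'mytable'                               # read rows back
    delete 'mytable', 'row1', 'cf:greeting'      # remove the cell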

    New User Website Perceptions

    That is, what does a new user see “above the fold” when first coming to the Apache HBase project homepage? For their first impression, is it easy to find things, and is the design appealing and easy to follow?

    The homepage is very simple in design and text, with a quick overview, Download link, and listing of features. The top navbar has links to everything else that comes with a Maven site build, plus a detailed list of links into the Documentation book the project produces. The documentation book seems very thorough and well edited, but the website for it (different navigation) was fairly slow in responding. The documentation book itself is massive, and covers topics from setup, programming, performance, scaling, troubleshooting, and much more.

    UI Design is very simple overall and consistent; the main website uses a basic Maven build and the documentation book uses a separate system. There’s no obvious “Help Contribute” link on the main website, but once you get into the documentation book there are detailed sections for coding styles and submitting patches. The team also uses ReviewBoard and JIRA heavily.

    The website is generated by Apache Maven.

    Apache Branding Requirements

    Apache projects are expected to manage their own affairs, including making all technical and content decisions for their code and websites. However to ensure a small modicum of consistency – and to ensure users know that an Apache project is hosted at the ASF – there are a few requirements all Apache projects must include in their websites (or wikis, etc.)

    • Apache HBase is used in the header and consistently and prominently on most of the website, and is carefully ™-attributed.
    • Website navigation links to ASF pages (except Security!) are included in different places in the sitewide header navigation.
    • Logo does not include TM; footers do not include a trademark attribution.
    • DOAP file exists.
    • Powered By HBase page includes simple lists of major users of HBase in a clean layout.
    • Homepage includes prominent links to Export Control as well as a Code of Conduct, which is nice to see.

    SEO / Search Hits / Related Sites

Well, SEO is far outside of our scope, but the question is: how does a new user find the Apache HBase homepage when searching?

    Searching for “HBase”:

    Top hit: homepage
    Second hit: wikipedia
    Other hits: variety of HBase related sites, many from vendors discussing integration with their products.

    Searching for “HBase software”:

    A wide variety of how to, what it is, and tutorial pages about HBase.

    Social Media Presence

    Many open source projects have a social media presence – although often not as polished or consistent a presence as a consumer or commercial brand presence would have.

    • appears official, not very active, not linked on homepage.
    • “HBase” group, 6600+ members
    • “Apache HBase” has some traffic.
    • is somewhat active.

    What Do You Think Apache HBase Is?

    So, what do you think? Is HBase something you’d use standalone for your own purposes, or do you expect most users simply use Hadoop or other higher-level tools to manage their tables? How important is the clean separation of the parts of the Hadoop stack, between HDFS, HBase, and the data distribution and management layers on top of them?

    Note: I’m writing here as an individual, not wearing any Apache hat. I hope this is useful both to new users and to the Apache HBase community, not necessarily a call to change anything. I haven’t used HBase for any real deployments myself, so please do comment with corrections to anything I’ve messed up above!

    0 0

    This is the fourth and final article in a series of posts on support for Javascript Object Signing and Encryption (JOSE) in Apache CXF. The first article covered how to sign content using JWS, while the second article showed how to encrypt content using JWE. The third article described how to construct JWT Tokens, how to sign and encrypt them, and how they can be used for authentication and authorization in Apache CXF. In this post, we will show how the CXF Security Token Service (STS) can be leveraged to issue and validate JWT Tokens.

    1) The CXF STS

    Apache CXF ships with a powerful and widely deployed STS implementation that has been covered extensively on this blog before. Clients interact with the STS via the SOAP WS-Trust interface, typically asking the STS to issue a (SAML) token by passing some parameters. The STS offers the following functionality with respect to tokens:

    • It can issue SAML (1.1 + 2.0) and SecurityContextTokens.
    • It can validate SAML, UsernameToken and BinarySecurityTokens.
    • It can renew SAML Tokens
    • It can cancel SecurityContextTokens.
    • It can transform tokens from one type to another, and from one realm to another.
    Wouldn't it be cool if you could ask the STS to issue and validate JWT tokens as well? Well that's exactly what you can do from the new CXF 3.1.5 release! If you already have an STS instance deployed to issue SAML tokens, then you can also issue JWT tokens to different clients with some simple configuration changes to your existing deployment.

    2) Issuing JWT Tokens from the STS

    Let's look at the most common use-case first, that of issuing JWT tokens from the STS. The client specifies a TokenType String in the request to indicate the type of the desired token. There is no standard WS-Trust Token Type for JWT tokens as there is for SAML Tokens. The default implementation that ships with the STS uses the token type "urn:ietf:params:oauth:token-type:jwt" (see here).

    The STS maintains a list of TokenProvider implementations, which it queries in turn to see whether it is capable of issuing a token of the given type. A new implementation is available to issue JWT Tokens - the JWTTokenProvider. By default tokens are signed via JWS using the STS master keystore (this is controlled via a "signToken" property of the JWTTokenProvider). The keystore configuration is exactly the same as for the SAML case. Tokens can also be encrypted via JWE if desired. Realms are also supported in the same way as for SAML Tokens.
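On the STS side, the new provider just needs to be added to the Issue operation's list of token providers. Here is a minimal sketch in plain Java (the JWTTokenProvider package name is assumed, the issuer name and signature properties file are placeholders, and real deployments typically wire this up via Spring instead):

    import java.util.Arrays;

    import org.apache.cxf.sts.StaticSTSProperties;
    import org.apache.cxf.sts.operation.TokenIssueOperation;
    import org.apache.cxf.sts.token.provider.SAMLTokenProvider;
    import org.apache.cxf.sts.token.provider.TokenProvider;
    import org.apache.cxf.sts.token.provider.jwt.JWTTokenProvider;

    public class IssueOperationFactory {

        public static TokenIssueOperation create() {
            // Sign issued JWTs (via JWS) with the STS master keystore - the default behaviour
            JWTTokenProvider jwtTokenProvider = new JWTTokenProvider();
            jwtTokenProvider.setSignToken(true);

            StaticSTSProperties stsProperties = new StaticSTSProperties();
            stsProperties.setIssuer("DoubleItSTSIssuer");                      // placeholder issuer name
            stsProperties.setSignaturePropertiesFile("stsKeystore.properties"); // placeholder keystore config

            TokenIssueOperation issueOperation = new TokenIssueOperation();
            issueOperation.setTokenProviders(
                    Arrays.<TokenProvider>asList(jwtTokenProvider, new SAMLTokenProvider()));
            issueOperation.setStsProperties(stsProperties);
            return issueOperation;
        }
    }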

    The claims inserted into the issued token are obtained via a JWTClaimsProvider Object configured in the JWTTokenProvider. The default implementation adds the following claims:
    • The current principal is added as the Subject claim.
    • The issuer name of the STS is added as the Issuer claim.
    • Any claims that were requested by the client via the WS-Trust "Claims" parameter (that can be handled by the ClaimManager of the STS).
    • The various "time" constraints, such as Expiry, NotBefore, IssuedAt, etc.
    • Finally, it adds in the audience claim obtained from an AppliesTo address and the wst:Participants, if either were specified by the client.
    The token that is generated by the JWTTokenProvider is in the form of a String. However, as the token will be included in the WS-Trust Response, the String must be "wrapped" somehow to form a valid XML Element. A TokenWrapper interface is defined to do precisely this. The default implementation simply inserts the JWT Token into the WS-Trust Response as the text content of a "TokenWrapper" Element.
      3) Validating JWT Tokens in the STS

      As well as issuing JWT Tokens, the STS can also validate them via the WS-Trust Validate operation. A new TokenValidator implementation is available to validate JWT tokens called the JWTTokenValidator. The signature of the token is first validated by the STS truststore. Then the time related claims of the token are checked, e.g. is the token expired or is the current time before the NotBefore time of the token, etc.

A useful feature of the WS-Trust validate operation is the ability to transform tokens from one type to another. Normally, a client just wants to know if a token is valid or not, and hence receives a "yes/no" response from the STS. However, if the client specifies a TokenType that doesn't correspond to the standard "Status" TokenType, but instead corresponds to a different token, the STS will validate the received token and then generate a new token of the desired type using the principal associated with the validated token.

      This "token transformation" functionality is also supported with the new JWT implementation. It is possible to transform a SAML Token into a JWT Token, and vice versa, something that could be quite useful in a deployment where you need to support both REST and SOAP services for example. Using a JWT Token as a WS-Trust OnBehalfOf/ActAs token is also supported.

      0 0

The Apache SystemML team is pleased to announce the release of Apache SystemML version 0.9.0-incubating. This is the first release as an Apache project. Apache SystemML provides declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans ranging from single-node, in-memory computations, to distributed computations on Apache Hadoop MapReduce and Apache Spark. Extensive updates have been made to the release in several areas. These include APIs, data ingestion, optimizations, language and runtime operators, new algorithms, testing, and online documentation. See the RELEASE NOTES for more details about the release, and to download the distribution please go to:

The Apache SystemML Team

      0 0

The Spark Technology Center team has just released SystemML 0.8.0. SystemML 0.8.0 is the first binary release of SystemML since its initial migration to GitHub on August 16, 2015. This release represents 320+ patches from 14 contributors since that date. SystemML became publicly available on GitHub on August 27, 2015. Extensive updates have been made to the project in several areas. These include APIs, data ingestion, optimizations, language and runtime operators, new algorithms, testing, and online documentation.

APIs
• Improvements to MLContext and to MLPipeline wrappers

Data Ingestion
• Data conversion utilities (from RDDs and DataFrames)
• Data transformations on raw data sets

Optimizations
• Extensions to compilation chain, including IPA
• Improvements to parfor
• Improved execution of concurrent Spark jobs
• New rewrites, including eager RDD caching and repartitioning
• Improvements to buffer pool caching
• Partitioning-preserving operations
• On-demand creation of SparkContext
• Efficient use of RDD checkpointing

Language and Runtime Operators
• New matrix multiplication operators (e.g., ZipMM)
• New multi-threaded readers and operators
• Extended aggregation-outer operations for different relational operators
• Sample capability

New Algorithms
• Alternating Least Squares (Conjugate Gradient)
• Cubic Splines (Conjugate Gradient and Direct Solve)

Testing
• PyDML algorithm tests
• Test suite refactoring
• Improvements to performance tests

Online Documentation
• GitHub README
• Quick Start Guide
• DML and PyDML Programming Guide
• MLContext Programming Guide
• Algorithms Reference
• DML Language Reference
• Debugger Guide
• Documentation site available at

      0 0

You have heard about "Machine Learning" and "Intelligent Computers" but have no idea how the machines learn? Harvard Business Review has a good article explaining, in simple English, how this is accomplished.
      "Machine learning.” You’ve heard the term, and you probably nod in agreement when someone tells you how important it is. But secretly you may not be sure what it is or how it works. Ask your data scientists to explain, and you may get lost in a sea of specialist talk about forks, leaf nodes, split points, and recursions. Forget all that. The only thing you need to know is that machine learning applies statistical models to the data you have in order to make smart predictions about data you don’t have. Those predictions can help you find signals in the noise and extract value from all the data you’re collecting. The advantage of—indeed, the imperative for—using machine learning is its speed and brute force. It can mine vast swaths of data in seconds or minutes, find patterns, and make predictions in ways that no human analyst could begin to emulate. Machine learning is, among other things, helping companies to detect that patients will have seizures long before they actually occur.
      Now that you have a better understanding, look into implementing your own Machine Learning using SystemML.

      0 0

      Apache CXF Fediz 1.2.2 has been released. The issues fixed can be seen here. Highlights include:

      • The core Apache CXF dependency is updated to the recent 3.0.8 release.
      • A new HomeRealm Discovery Service based on Spring EL is available in the IdP.
      • Support for configurable token expiration validation in the plugins has been added.
      • Various fixes for the websphere container plugin have been added.
A new feature in 1.2.2 is the ability to specify a constraint in the IdP on the acceptable 'wreply' value for a given service. When the IdP successfully authenticates the end user, it will issue the WS-Federation response to the value specified in the initial request in the 'wreply' parameter. However, this could be exploited by a malicious third party to redirect the end user to a custom address, where the issued token could be retrieved. In 1.2.2, there is a new property associated with the Application in the IdP called 'passiveRequestorEndpointConstraint'. This is a regular expression defining the acceptable values for the 'wreply' endpoint associated with this Application. If this property is not specified, a warning is logged in the IdP.

      0 0

      It's now over a month since I outlined my 2016 plans: get fitter, do more on testability.

      What's the progress?

      I've got a place on the 2016 Fred Whitton Challenge, which is widely regarded as the hardest one-day "fun" ride in the UK.

Having done it in 2014, I am minded to concur.

      That time 4 people got helicoptered off; I have no intention of qualifying for the same transport this year.

      Instead: I now have to get fit enough to do 110 miles up 30% gradients, when I'm currently fit enough to do 25-30 miles.

      In 10 weeks.

      0 0

      Someone recently asked me: What happens when I drop a column in CQL? With all the recent changes in the storage engine I took the opportunity to explore the new code. In short, we will continue to read the dropped column from disk until the files are rewritten by compaction or you force Cassandra to rewrite the files.

      The Setup

      To see what happens when columns are dropped I started a server using the tip of the Cassandra 3.0 branch and created the following schema using cqlsh:

      create keyspace dev WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
      use dev;
create table foo (
     foo text primary key,
     bar text,
     baz text
);

insert into foo (foo, bar, baz) values ('foo', 'this is bar', 'this is baz');

      I wanted to ensure I was observing what happens when we read from SSTables, rather than Memtables, so the next step was to flush to disk:

      $ nodetool flush

      Then confirm I had the expected behaviour: the foo, bar, and baz columns were returned:

cqlsh:dev> select * from foo;

 foo | bar         | baz
-----+-------------+-------------
 foo | this is bar | this is baz

(1 rows)

      Great, now to drop the baz column (via cqlsh):

      alter table foo
      drop baz;

      And confirm the data is not there:

cqlsh:dev> select * from foo;

 foo | bar
-----+-------------
 foo | this is bar

(1 rows)

      Note: If you restart the node after this point you will hit the issue described in CASSANDRA-11050. As a work around flush the system_schema.dropped_columns table before restarting the node:

      nodetool flush system_schema dropped_columns

      Down (almost) To The Disk

The 3.0 storage engine is focused on rows. These are managed through the o.a.c.db.rows.Unfiltered interface, and (de)serialised by the o.a.c.db.rows.UnfilteredSerializer class. To get the o.a.c.db.rows.Row from the SSTable a call is made to UnfilteredSerializer.deserializeRowBody(), which iterates over all the Cells in the Row. We know which Cells are encoded in the Partition by looking at the o.a.c.db.SerializationHeader and the Row flags. The Row will either have all the Cells encoded in the Partition, or a subset encoded as a bit flag.

      Whichever process is used, we sequentially read the Cells for the query from the data stream. This is managed by UnfilteredSerializer.readSimpleColumn():

private void readSimpleColumn(ColumnDefinition column, DataInputPlus in, SerializationHeader header, SerializationHelper helper, Row.Builder builder, LivenessInfo rowLiveness)
throws IOException
{
    if (helper.includes(column))
    {
        Cell cell = Cell.serializer.deserialize(in, rowLiveness, column, header, helper);
        if (!helper.isDropped(cell, false))
            builder.addCell(cell);
    }
    else
    {
        Cell.serializer.skip(in, column, header);
    }
}

We will read the column if it is required by the CQL query; the call to helper.includes() checks this, otherwise we skip the data in the input stream. For a query that reads all the cells, such as the select * from foo above, helper.includes() will return true for all the Cells in the Row. Once we have the Cell in memory we then check to see if it was dropped from the Table.

So the answer is: yes, in some circumstances we will read dropped Cells from disk. Which may lead you to ask, why?

      For each unique Column dropped from a Table we keep a timestamp of when the drop occurred (CFMetaData.DroppedColumn):

      // drop timestamp, in microseconds, yet with millisecond granularity
      public final long droppedTime;

      Any Cells on disk created before this time were present when the Column was dropped and should not be considered. Any created after, which can only happen if the Column was re-added to the Table, should be.

      This says something interesting about the timestamp for Cells, which we will see below.

      Won’t Someone Think of The Performance!

Of course the Cells that were on disk before we dropped the Column are still read: the on-disk files are immutable. Cassandra needs a reason to re-write the SSTables so it can filter out the Cells that represent deleted Columns. There are a few ways to do that, but first we need a way to verify the expected outcome. The sstable2json tool was removed (CASSANDRA-7464) in 3.0, however Andrew Tolbert and Chris Lohfink have created a handy tool that fills the gap: sstable-tools.

      Using sstable-tools we can check the contents of the one SSTable created above, and confirm that as we expect there is a Cell for baz:

$ java -jar target/sstable-tools-3.0.0-SNAPSHOT.jar toJson data/data/dev/foo-e61a4890cd5311e59b10a78b2c43262c/ma-1-big-Data.db
[
  {
    "partition" : {
      "key" : [ "foo" ]
    },
    "rows" : [
      {
        "type" : "row",
        "liveness_info" : { "tstamp" : 1454819593731668 },
        "cells" : [
          { "name" : "bar", "value" : "this is bar" },
          { "name" : "baz", "value" : "this is baz" }
        ]
      }
    ]
  }
]

      A simple way to force a re-write is to run upgradesstables:

      bin/nodetool  upgradesstables --include-all-sstables dev foo

      And check the new file:

$ java -jar target/sstable-tools-3.0.0-SNAPSHOT.jar toJson data/data/dev/foo-e61a4890cd5311e59b10a78b2c43262c/ma-2-big-Data.db
[
  {
    "partition" : {
      "key" : [ "foo" ]
    },
    "rows" : [
      {
        "type" : "row",
        "liveness_info" : { "tstamp" : 1454819593731668 },
        "cells" : [
          { "name" : "bar", "value" : "this is bar" }
        ]
      }
    ]
  }
]

      That’s great but not very practical. As you would expect though upgradesstables uses similar code paths to regular compaction, meaning that as your data is compacted Cells for dropped Columns will be purged from disk. This will work well for recent data that is still under active compaction, or when using the Levelled Compaction Strategy where data is more frequently compacted. A different approach may be needed for older data that is no longer compacted.

      Gardening Duty

      When it comes to actively purging Cells from disk for a Column you have dropped the first thing you will need to know is when the Column was dropped. This can easily be found via cqlsh:

cqlsh:system_schema> select * from system_schema.dropped_columns;

 keyspace_name | table_name | column_name | dropped_time             | type
---------------+------------+-------------+--------------------------+------
           dev |        foo |         baz | 2016-02-07 05:26:10+0000 | text

(1 rows)

      Then find the SSTables that were created before that date.

      Finally run a user defined compaction on the SSTables using the CompactionManagerMBean.forceUserDefinedCompaction() JMX operation. For example when using jmxterm:

      $ jmxterm
      Welcome to JMX terminal. Type "help" for available commands.
      $>open localhost:7199
      #Connection to localhost:7199 is opened
      $>bean org.apache.cassandra.db:type=CompactionManager
      #bean is set to org.apache.cassandra.db:type=CompactionManager
      #mbean = org.apache.cassandra.db:type=CompactionManager
      #class name = org.apache.cassandra.db.compaction.CompactionManager
      # attributes
        %0   - CompactionHistory (, r)
        %1   - CompactionSummary (java.util.List, r)
        %2   - Compactions (java.util.List, r)
        %3   - CoreCompactorThreads (int, rw)
        %4   - CoreValidationThreads (int, rw)
        %5   - MaximumCompactorThreads (int, rw)
        %6   - MaximumValidatorThreads (int, rw)
      # operations
        %0   - void forceUserDefinedCompaction(java.lang.String p1)
        %1   - void stopCompaction(java.lang.String p1)
        %2   - void stopCompactionById(java.lang.String p1)
      #there's no notifications
      $>run forceUserDefinedCompaction "/Users/aaron/code/apache/cassandra/data/data/dev/foo-13a1d880cd5b11e5a714a1b88fe46d8c/ma-7-big-Data.db"
      #calling operation forceUserDefinedCompaction of mbean org.apache.cassandra.db:type=CompactionManager
      #operation returns: 

The operation only logs if it cannot find the file, but you can follow the progress using nodetool compactionstats.

      Dropping Columns And Timestamps

Remember the check above to determine if a Cell was created before the Column was dropped? That only works if you are using actual time for the timestamp when inserting data. If you do nothing, that is, you do not use the TIMESTAMP clause of the INSERT statement, the timestamp will be set to microseconds with millisecond precision via ClientState.getTimestamp(). Remember that phrase from above?

      // drop timestamp, in microseconds, yet with millisecond granularity

The same scale of value is used when recording information in the dropped_columns table. If you are going to drop Columns from your tables, timestamps must be in microseconds past the epoch. This is mentioned in the documentation for the ALTER TABLE statement, but is worth emphasising. Back in the day we could say “the timestamp is a 64 bit int, which is microseconds past the epoch by convention”. Light Weight Transactions had a requirement that it be real time, but that was a special case. It’s now a general requirement that Cell timestamps are real time.
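To make that concrete, here is a small illustration of my own (not Cassandra source code) of what a microseconds-past-the-epoch timestamp with millisecond granularity looks like, and of why a seconds- or milliseconds-based client timestamp breaks the dropped-column check:

    public class DroppedColumnTimestamps {

        // What a server-generated write timestamp effectively looks like when the client
        // supplies none: microseconds past the epoch, with millisecond granularity
        static long defaultWriteTimestampMicros() {
            return System.currentTimeMillis() * 1000L;
        }

        // A Cell only survives the drop check if it was written after the Column was
        // dropped (i.e. the Column was re-added and new data written)
        static boolean cellSurvivesDrop(long cellTimestampMicros, long droppedTimeMicros) {
            return cellTimestampMicros > droppedTimeMicros;
        }

        public static void main(String[] args) {
            long droppedTime = defaultWriteTimestampMicros();
            // A client writing with a milliseconds-based timestamp will always appear to be
            // "before" the drop, so its new data would silently be filtered out
            long millisBasedTimestamp = System.currentTimeMillis();
            System.out.println(cellSurvivesDrop(millisBasedTimestamp, droppedTime)); // false
        }
    }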

      0 0

      I finally got a chance last night to make it back to Austin’s Cassandra Meetup group and present a newer version of my Hardening Cassandra talk. I really like going to this group when I can, because we get a wide variety of skill sets and backgrounds. Austin is just eclectic like that!

      Unlike my experience with this topic in Seattle last month where we had a lot of active Cassandra users with little security knowledge, most folks in the room had security backgrounds, but were new to Cassandra and had been tasked with hardening a new cluster. It was a good fit and I’m pretty sure attendees got what they needed.

If securing Cassandra is a topic of interest to you, please see some of my previous posts on logging exposure and node to node encryption. As always, feel free to tweet at me if you have any questions.

      0 0
    • 02/19/16--15:31: Nick Kew: A Hollow Crown
    • Our prime minister returns triumphant from Brussels, his enemies vanquished.

      Or perhaps, he returns triumphant from annoying his friends, bringing with him ammunition for his enemies.

      Or does he play a double game against all of us?  But more on that later.

      His brief speech we heard on the radio news this evening actually sounded genuinely interesting in parts.  The story told in the media has been consistently different.  Doubtless both based on an element of truth and spun from there.

      The big story the media concentrate on (though what they say may not be entirely accurate) is about curbing benefits to migrants, on the face of it something entirely reasonable.  Or rather, something utterly preposterous: it’s only because our benefits system is monstrously broken that EU rules (accidentally) apply to it in the first place.   Germany, for example doesn’t have our “in work benefits” problem.  But instead of fixing it, he inflicts  gratuitous discrimination on (some) foreign workers, in the hope that one more wrong piled on to the mess might make a right.

      It’s supposed to reduce net migration.  That seems unlikely to happen.  Farage & Co are saying so, and the nutters are much more dangerous when they’re also right about an issue.  I expect Cameron will pull a rabbit or two from his hat to wrong-foot them ahead of the referendum, but this fundamental point won’t budge.  Two wrongs make an anti-right.

      Which brings me to the conspiracy idea: is Cameron in fact saying one thing but working for the opposite (as The Liar did over hunting)?  He has gerrymandered the electorate, conveniently setting aside a manifesto pledge to extend the vote to Brits long-term abroad (who may naturally have the strongest reasons to vote stay) and will also exclude EU citizens resident and working in the UK (ditto).  He’s promised everything the Europhobes asked for in terms of re-formulating the referendum question and terms of the debate, yet no word on conceding to the (europhile) SNP on the subject of the referendum date not clashing with their election.  In short, he seems in his actions to be working for an exit!

      Time will tell.  But on a personal level, should I get out now, ahead of a time when there might be serious barriers to a move?  Ugh.

      Oh, and if you pay more child benefit to children in the UK than in their home countries, doesn’t that risk incentivising foreign workers to bring their complete families?  So they burden our schools all the more, and become altogether more likely to remain here long-term or permanently. Unintended consequences, or misleading reporting?

      0 0

      I don't believe there is a Pulitzer Prize for software.

      But if there was such a prize, it should be given to the teams from RedHat and from Google who worked on CVE-2015-7547.

      Let's start the roundup by looking a bit at Dan Kaminsky's essay: A Skeleton Key of Unknown Strength

      The glibc DNS bug (CVE-2015-7547) is unusually bad. Even Shellshock and Heartbleed tended to affect things we knew were on the network and knew we had to defend. This affects a universally used library (glibc) at a universally used protocol (DNS). Generic tools that we didn’t even know had network surface (sudo) are thus exposed, as is software written in programming languages designed explicitly to be safe.

      Kaminsky goes on to give a high-level summary of how the bug allows attacks:

Somewhat simplified, the attacks depend on:
      • A buffer being filled with about 2048 bytes of data from a DNS response
      • The stub retrying, for whatever reason
      • Two responses ultimately getting stacked into the same buffer, with over 2048 bytes from the wire
      The flaw is linked to the fact that the stack has two outstanding requests at the same time – one for IPv4 addresses, and one for IPv6 addresses. Furthermore DNS can operate over both UDP and TCP, with the ability to upgrade from the former to the latter. There is error handling in DNS, but most errors and retries are handled by the caching resolver, not the stub. That means any weird errors just cause the (safer, more properly written) middlebox to handle the complexity, reducing degrees of freedom for hitting glibc.

      An interesting thing about this bug is that it was more-or-less concurrently studied by two separate security analysis teams. Here's how the Google team summarize the issue in their article: CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow

      glibc reserves 2048 bytes in the stack through alloca() for the DNS answer at _nss_dns_gethostbyname4_r() for hosting responses to a DNS query.

      Later on, at send_dg() and send_vc(), if the response is larger than 2048 bytes, a new buffer is allocated from the heap and all the information (buffer pointer, new buffer size and response size) is updated.

      Under certain conditions a mismatch between the stack buffer and the new heap allocation will happen. The final effect is that the stack buffer will be used to store the DNS response, even though the response is larger than the stack buffer and a heap buffer was allocated. This behavior leads to the stack buffer overflow.

      The vectors to trigger this buffer overflow are very common and can include ssh, sudo, and curl. We are confident that the exploitation vectors are diverse and widespread; we have not attempted to enumerate these vectors further.

      That last paragraph is a doozy.

      Still, both of the above articles, although fascinating and informative, pale beside the epic, encyclopedic, exhaustive, and fascinating treatise written by Carlos O'Donell of RedHat and posted to the GNU C Library mailing list: [PATCH] CVE-2015-7547 --- glibc getaddrinfo() stack-based buffer overflow.

      O'Donell's explication of the bug is perhaps the greatest debugging/diagnosis/post-mortem write-up of a bug that I think I've ever read.

      If you've ever tried to precisely describe a bug, and how it can cause a security vulnerability, you'll know how hard it is to do that both exactly and clearly. Here's how O'Donell does it:

      The defect is located in the glibc sources in the following file:

      - resolv/res_send.c

      as part of the send_dg and send_vc functions which are part of the
      __libc_res_nsend (res_nsend) interface which is used by many of the
      higher level interfaces including getaddrinfo (indirectly via the DNS
      NSS module.)

      One way to trigger the buffer mismanagement is like this:

      * Have the target attempt a DNS resolution for a domain you control.
      - Need to get A and AAAA queries.
      * First response is 2048 bytes.
      - Fills the alloca buffer entirely with 0 left over.
      - send_dg attemps to reuse the user buffer but can't.
      - New buffer created but due to bug old alloca buffer is used with new
      size of 65535 (size of the malloc'd buffer).
      - Response should be valid.
      * Send second response.
      - This response should be flawed in such a way that it forces
      __libc_res_nsend to retry the query. It is sufficient for example to
      pick any of the listed failure modes in the code which return zero.
      * Send third response.
      - The third response can contain 2048 bytes of valid response.
      - The remaining 63487 bytes of the response are the attack payload and
      the recvfrom smashes the stack with it.

      The flaw happens because when send_dg is retried it restarts the query,
      but the second time around the answer buffer points to the alloca'd
      buffer but with the wrong size.

      O'Donell then proceeds to walk you through the bug, line by line, showing how the code in question proceeds, inexorably, down the path to destruction, until it commits the fatal mistake:

      So we allocate a new buffer, set *anssizp to MAXPACKET, but fail to set *ansp to the new buffer, and fail to update *thisanssizp to the new size.
      And, therefore:
      So now in __libc_res_nsend the first answer buffer has a recorded size of MAXPACKET bytes, but is still the same alloca'd space that is only 2048 bytes long.

      The send_dg function exits, and we loop in __libc_res_nsend looking for an answer with the next resolver. The buffers are reused and send_dg is called again and this time it results in `MAXPACKET - 2048` bytes being overflowed from the response directly onto the stack.

      There's more, too, and O'Donell takes you through all of it, including several other bugs that were much less severe which they uncovered while tracking this down and studying it using tools like valgrind.

      O'Donell's patch is very precise, very clearly explained, very thoroughly studied.

      But, as Kaminsky points out in today's follow-up, it's still not clear that we understand the extent of the danger of this bug: I Might Be Afraid Of This Ghost

      A few people have privately asked me how this particular flaw compares to last year’s issue, dubbed “Ghost” by its finders at Qualys.


      the constraints on CVE-2015-7547 are “IPv6 compatible getaddrinfo”. That ain’t much. The bug doesn’t even care about the payload, only how much is delivered and if it had to retry.

      It’s also a much larger malicious payload we get to work with. Ghost was four bytes (not that that’s not enough, but still).

      In Ghost’s defense, we know that flaw can traverse caches, requiring far less access for attackers. CVE-2015-7547 is weird enough that we’re just not sure.

      It's fascinating that, apparently due to complete coincidence, the teams at Google and at RedHat uncovered this behavior independently. Better, they figured out a way to coordinate their work:

      In the course of our investigation, and to our surprise, we learned that the glibc maintainers had previously been alerted of the issue via their bug tracker in July, 2015. (bug). We couldn't immediately tell whether the bug fix was underway, so we worked hard to make sure we understood the issue and then reached out to the glibc maintainers. To our delight, Florian Weimer and Carlos O’Donell of Red Hat had also been studying the bug’s impact, albeit completely independently! Due to the sensitive nature of the issue, the investigation, patch creation, and regression tests performed primarily by Florian and Carlos had continued “off-bug.”

      This was an amazing coincidence, and thanks to their hard work and cooperation, we were able to translate both teams’ knowledge into a comprehensive patch and regression test to protect glibc users.

      It was very interesting to read these articles, and I'm glad that the various teams took the time to share them, and even more glad that companies like RedHat and Google are continuing to fund work like this, because, in the end, this is how software becomes better, painful though that process might be.

      0 0

      (The title is of course from the gentle send-up of loony fringe politics in the Life of Brian).

      Our referendum doesn’t just have two opposing sides, it has a bunch of opposing teams on the “out of Europe” side.  Not to mention opposing views among them of what Britain might look like and what direction it might take outside the EU.

      That leaves our electoral commission with a bit of a dilemma.  Some horribly unfit-for-purpose rules say it has to hand various resources – like public money and TV airtime – to each side in the campaign.  In order to do so, it seems to have to designate one of those “out” groups as the official campaign, at the expense of the others.  That’ll leave the losers crying foul.

      Here’s a plea to them.  Give it to Farage’s lot.

      Farage will be insufferable anyway.  Not that I can really blame him in the circumstances: this is the consummation of his entire political career.  And he’s media-friendly: he’ll get  more airtime than pretty-much anyone else regardless of the electoral commission’s decision.  And he’ll tell bigger and more blatant porkies than the mainstream politicians, with a straight face.

      If he doesn’t get the money, he’ll not just be ubiquitous, he’ll have a real grievance.  That might in itself make him as unstoppable as Trump: the more outrageous he gets, the more popular it’ll make him.  Better he have the rope to hang himself than to hang the country.

      0 0

While in Sydney I got the opportunity to catch up with some Australian Cassandra users at the Sydney Cassandra Meetup. Adoption down under has been a bit behind the ball. That seems to be changing with the presence of DataStax and InstaClustr, and with the global trend catching on in finance and at the top end of business.

      The evening was hosted in The Rocks at RoZetta Technology underneath the Sydney Harbour Bridge. I had the honours of being the only presenter for the evening and to talk for close to an hour on Taking the Pain out of Repairs. It was the topic we (TLP) thought would best suit the crowd, largely people still getting the hang of Cassandra in production. I enjoyed being back in the city I grew up in and the crowd was a great bunch, warm and relaxed with plenty of questions and discussion.

The slide deck is available, and I hope to write up a longer blog post with more of the details soon. Of course, in the meantime feel free to tweet me if you have any questions.

And a big Thank You to RoZetta Software, Sirca and DataStax for hosting the night.

      0 0

Systems integration has been a tough problem since enterprises first started using software systems. It has taken many forms over the last couple of decades and is still evolving, along with the other aspects of computing.

In the very beginning, people designed heterogeneous monolithic systems to automate certain parts of their business, and integration back then was mostly a manual process with a lot of human intervention. It came in the form of data entry from one system to the other, or the use of data export and import. With the arrival of the internet, people came up with point-to-point connections where each system had to be modified to integrate with the other. These interactions were mostly file based; more serious integration tasks used queue-based messaging, which provided reliability over the file-based approach.

Then came the web era: people wanted to expose at least a small piece of their systems over the web, introducing public user interaction into the picture. This resulted in J2EE-like technologies that allowed these systems to be exposed over the web, and systems tended to follow the client-server architecture. However, systems integration on the server side still suffered from the same set of problems.

The next step was a major one towards successful systems integration. With the advancement of distributed computing, the concept of services came into the picture and a new paradigm was introduced: "Service Oriented Architecture", commonly known as SOA. While this paradigm theoretically solved most of the issues, the legacy systems already in use made a bottom-up implementation of SOA problematic. These services (mostly called web services) were connected in a hub-like architecture, which also bridged the gap between web services and legacy systems by performing protocol and format conversion. This is the "Enterprise Service Bus" (commonly known as ESB), which is still being used to solve integration issues between systems. It provides location transparency and many other quality-of-service improvements to the integration. We (AdroitLogic) have a product offering for this, named the UltraESB.

Even though web services technologies seemed to offer a promising solution, the weight of the XML-based SOAP infoset and the WS-* stack led people to look for alternatives. REST was later used as the framework for a major share of services, and specifications like JAX-RS helped REST reach the enterprise with formats like JSON, plain text and POX, though the technology has constraints such as the transport used and the lack of application-level security and reliability. Technologies such as Protocol Buffers, Thrift, Hessian and AS2 are also used, and standardized, within specific domains. AdroitLogic has a separate product offering for AS2 communication, named the AS2 Gateway, while the rest of these technologies are generically supported by the UltraESB.

Just as enterprises once wanted parts of their system functionality exposed as web front ends for users, they are now interested in sharing some of their services over the internet. This brought security concerns for the exposed services. These are identified as APIs that an organization wants to expose with security, governance and many other centrally managed features. API Management solutions address this, and the product we came up with for this domain is API Director.

We are now at the introduction of yet another software paradigm and a transition in services integration, known as Microservices.
On a separate note, cloud computing evolved hand-in-hand with systems integration and service exposure. In the meantime the basic component of cloud computing, virtualization, has taken a revolutionary step to provide process-oriented containers that emulate an O/S yet utilize resources more effectively than a virtual machine. Integration has to evolve to facilitate this movement, and a container-driven, lean, distributed, highly scalable solution with the ability to easily provision new integration flows has to be invented. This concept spans from the integration of services, APIs and partners to front ends, users and devices (including mobile devices), and is named the "Integration Platform".

      Yes, we are now working on getting a product offering out to address this, which runs on top of a set of Docker containers, utilizing the Kubernetes container management platform.

      Stay tuned to hear more about this amazing piece of software that will change the way you look at the problem of systems integration!!

      0 0

      Windows never ceases to disappoint me. I've got an external hard drive, on which resides a 60 GB VM, which I need to copy now and then to emulate a snapshot. So far, I used the standard Windows file copy stuff to do that. (Copy, and paste the VM directory.) Ran with about 8-20 MB per second. Or in other words: Took up to two hours. And, worst of all, the procedure isn't reentrant. Interrupting the copying means to restart.

So, I tried something different: CygWin's rsync, the swiss army knife for backups and related stuff. In other words, I am using the command

        rsync -a -r --progress
      Runs with 30MB per second (50% faster than native Windows copying). And is reentrant...

      0 0

      Website Brand Review of Apache Flex

      Many projects come to Apache from software vendors donating them to the Apache community, where the Apache Incubator works to form an open and independent community around the project. Here, Adobe donated both the code and the brand for their Flex project to Apache. Now, the ASF is the steward both to the vibrant Apache Flex community, as well as the new owner of the Flex brand and registered trademark.

      Here’s my quick review of the Apache Flex project, told purely from the point of view of a new user finding the project website. While we’re all familiar with Adobe Flash browser plugin, not everyone may be familiar with the Flex environment for building Flash (and other!) applications.

      What Is Apache Flex?

      Apache Flex® is the open-source framework for building expressive web and mobile applications.

      In other words, Flex is a toolkit for building general applications that can be run on a variety of web browsers and mobile platforms that include the Adobe Flash or Adobe AIR runtimes or application containers. Flex is the coding language and environment you use to write applications for the Flash/AIR containers.

      No, Really, What Is Apache Flex For?

      Flex is an SDK – or software development kit. Flex is the code libraries, compiler, and other tools that a developer uses to build applications for the Flash or AIR runtimes. Flex does not include an IDE; while you can hand-write Flex code with your favorite editor, any of several popular IDEs make the build process much easier.

      Developers can write in MXML layout language to define documents or screens, including responsive applications for differing screen or presentation modes or features. You use ActionScript to write code for application logic. The Flex SDK includes a wide variety of APIs for accessing features of the target runtime, including mobile device features like GPS, accelerometers, cameras, and the like.

      Flex takes your application code and compiles it into an SWF file, which can then run in a target Flash or AIR runtime on a wide variety of platforms or devices.

      The Flex project includes a variety of other helper and testing tools, as well as a complete automated unit test suite.

      New User Website Perceptions

      That is, what does a new user see “above the fold” when first coming to the Apache Flex project homepage? For their first impression, is it easy to find things, and is the design appealing and easy to follow?

      The homepage is professionally laid out, with integrated design, graphics, fonts, a colorful but useful carousel, and prominently positioned social media links. Key Documentation, Download, and other links are prominent. The design is consistent across the bulk of the top level links; API reference documentation and the like uses a different, but still pleasant design. The main site footer in particular includes useful About and Subscribe blurbs, including a News ticker.

The main Download for the SDK Installer (what new developers would start with) unfortunately uses a Flash control to begin the download, which some users may have blocked in their web browsers. Since the primary target of use is Flash or AIR applications, this makes sense – and there’s a direct link to how to install without Flash on that page.

      Pointers to documentation, including Getting Started and How To Get Involved are nicely written and include plenty of details. In particular, the homepage and navbar include plenty of links to introductions and training/learning materials and classes for building Flex applications. While the Flex environment can have a lot of complexity, there are plenty of pointers to good educational materials by the project community and by Adobe. Similarly, there are many examples of existing projects built with Flex – there are plenty of designers in this community as well as coders!

      Perhaps I’m getting old, but I found the default font size for body text on the site to be quite small and light in color, making it a more difficult read.

      Apache Branding Requirements

      Apache projects are expected to manage their own affairs, including making all technical and content decisions for their code and websites. However to ensure a small modicum of consistency – and to ensure users know that an Apache project is hosted at the ASF – there are a few requirements all Apache projects must include in their websites (or wikis, etc.)

      • Apache Flex is used consistently, and is carefully ® attributed appropriately.
• Website navigation links to ASF pages (though not Security!) are included in the site’s navigation system.
• Logo includes ™; footers include trademark attributions and a privacy policy link.
      • DOAP file exists and includes latest release.
      • A Community – Third Party Tools page includes a categorized list of a wide variety of third party tools related to building, using, and testing Flex projects.
      • Homepage includes prominent section discussing the Apache license, with a link to a page listing the license and trademark policies. (Note: seems like the about-licensing.html page needs tweaking)

      SEO / Search Hits / Related Sites

Well, SEO is far outside of our scope (and debatable in usefulness anyway), but it’s interesting to see: how does a new user find the Apache Flex homepage when searching?

      Searching for “Flex” (a common word, so we might expect a lot of other hits):

      Top hit: varies – either unrelated hits, or Adobe’s information page about Flex
      Other top hits: Adobe page, wikipedia page, project homepage, other sites.

      Searching for “Flex software”:

      Top hits are typically either the Adobe page or wikipedia page. The Apache project homepage is on the first page of results, and a variety of other informational pages (some about unrelated FLEX software) are also found.

      Social Media Presence

      The Flex project has a notable official social media presence linked prominently on the homepage.

      What Do You Think Apache Flex Is?

      So, what do you think? Is Flex still relevant in the age of HTML5, endless JavaScript frameworks, and the move away from custom runtimes and tooling to simpler code for interactive features? Will you be more interested in Flex once it can compile your ActionScript directly to HTML/Javascript in the browser without the Flash plugin?

      Note: I’m writing here as an individual, not wearing any Apache hat. I hope this is useful both to new users and to the Apache Flex community, not necessarily a call to change anything. I haven’t used Flex for any real deployments myself, so please do comment with corrections to anything I’ve messed up above!

      0 0

      The rain is over. Bummer. It was good while it lasted.

      • Disks for Data Centers: White paper for FAST 2016
        We believe it is time to develop, jointly with the industry and academia, a new line of disks that are specifically designed for large scale data centers and services.
      • Not-quite-so-broken TLS: lessons in re-engineering a security protocol specification and implementation
        On the surface this is a paper about a TLS implementation, but the really interesting story to me is the attempt to ‘do it right,’ and the techniques and considerations involved in that process. The IT landscape is littered with bug-ridden and vulnerable software – surely we can do better? And if we’re going to make a serious attempt at that, where better than something like a TLS stack – because bugs and vulnerabilities there also expose everything that relies on it – i.e. pretty much the whole of the internet.
      • A Critique of ANSI SQL Isolation Levels
        The ANSI SQL isolation levels were originally defined in prose, in terms of three specific anomalies that they were designed to prevent. Unsurprisingly, it turns out that those original definitions are somewhat ambiguous and open to both a strict and a broad interpretation. It also turns out that they are not sufficient, since there is one even more basic anomaly not mentioned in the standard that needs to be prevented in order to be able to implement rollback and recovery. Looking even more deeply, the paper uncovers eight different phenomena (anomalies) that can occur, and six different isolation levels.
      • Google's Transition from Single Datacenter, to Failover, to a Native Multihomed Architecture
        The main idea of the paper is that the typical failover architecture used when moving from a single datacenter to multiple datacenters doesn’t work well in practice. What does work, where work means using fewer resources while providing high availability and consistency, is a natively multihomed architecture
      • The Deactivation of the American Worker
        The job terminations, like the bulk of the media outlet’s work, were first experienced by most Gawker employees in digital, rather than physical space. Deleting the accounts was merely the company’s attempt to assert control of its office space, and Slack’s role in the layoffs simply exemplified where work was actually being done; it also serves as an indicator of, for many employees in the coming years, where it will end.
      • The Absurdity of What Investors See Each Day
        What happened is when NYSE first allowed [traders] to collocate in the [same building], people started to get into pissing matches over the length of their cables. Just to give you an idea, a foot of cable equates to one nanosecond, which is a billionth of a second. People were getting into pissing matches over a billionth of a second.


        NYSE measured the distance to the furthest cabinet, which is where people put their servers. It was 185 yards. So they gave every [high-frequency trader] a cable of 185 yards.

        Then, traders who were previously closer to the [exchange server] asked to move to the farthest end of the building. Why? Because when a cable is coiled up, there's a light dispersion that is slightly greater than when the cable is straight.

      • President Obama Announces His Intent to Nominate Carla D. Hayden as Librarian of Congress
        She began her career with the Chicago Public Library as the Young Adult Services Coordinator from 1979 to 1982 and as a Library Associate and Children’s Librarian from 1973 to 1979.

      0 0

I recently had to remove a disk from all the Apache Cassandra instances of a production cluster. This post's purpose is to share the full process, optimising the overall operation time and reducing the downtime for each node.


This post is about removing one disk by transferring its data to one other disk; the process will need to be modified to remove more than one disk or to move data to multiple disks. All the following operations can be run in parallel, except the last step, running the script, as it involves restarting the node.

      There are three directories we need to consider:

      • old-dir refers to the folder mounted on the disk we want to remove.
• tmp-dir is a folder we will temporarily use for the needs of the operation.
      • new-dir is the existing data folder on the disk we want to keep.

      The ‘natural’ way

      The rough and natural way of doing this is:

      1. Stop one node.
      2. Move the SSTables from the old-dir to the new-dir.
      3. Change the data_file_directories in cassandra.yaml to mirror disk changes.
      4. Restart the node.
      5. Go to the next node and repeat the same steps.

      Very simple, isn’t it?

Well, it is as simple as it is inefficient. Let’s consider we have 10 files of 100 GB each on the disk to remove, on each node of a 30-node cluster, under a data folder called old-dir. Let’s also consider it takes 10 hours to move the 10 files.

      Then using the rough way of processing, nodes will be down for 10 hours each, and the operation will take very long:

    30 nodes * 10 h = 300 hours (about 12.5 days)

This is a very long time on a running production cluster, with probably more operations waiting in the TODO list. It will increase linearly with more data or more nodes. Also, you might not be there exactly 10 hours later and may only move on to the next node the day after, making this twice as long as it could theoretically be. This is clearly not the best way to go, even if it ‘works’.

Plus, as nodes will be down for more than 3 hours (the default max_hint_window_in_ms), hints will no longer be stored for the node, meaning a full node repair will be needed every time a node comes back online, substantially increasing the overall operation time.

      Let’s not do that.

      The (most?) efficient way

The main idea behind the process I will describe is that the mv command is nearly instant when it is run within the same filesystem (here, the same physical disk). The mv command does not move the data or the inode representing the file, it just updates the links (directory entries) pointing to it. This way ‘moving’ petabytes of data takes less than a second.

The problem is that mv needs to physically copy the data when the source and destination directories are on different disks. That’s why it is relevant to first rsync the data from old-dir to tmp-dir (tmp-dir being on the same disk as new-dir).

Copying (not moving) the data to a temporary folder outside of the Cassandra data directories allows us to run the copy in parallel on all the nodes, without shutting them down.

      Way to go

Make sure to run this procedure, or at least every rsync, inside a screen session. This is a best practice while running operations, as it prevents an unexpected network hiccup from interfering with the procedure. It also allows teammates to take over easily.

      1. Make sure there is enough disk space on the target disk for all the data on the old-dir.

      2. First rsync

         sudo rsync -azvP --delete-before <old-dir>/data/ <tmp-dir>

  Explanations: first rsync from old-dir to tmp-dir. This can be run in parallel on all the nodes. Options in use are:

  • -a: Preserves permissions, owner, group…
  • -z: Compresses data for transfer.
  • -v: Gives detailed information (verbose).
  • -P: Shows progress and keeps partially transferred files (equivalent to --partial --progress).
  • --delete-before: Removes, before the transfer starts, any file in the destination folder that is not present in the source folder.
        • Bandwidth used by rsync is tunable using the --bwlimit options, see the man page for more information. A good starting value could be the stream_throughput_outbound_megabits_per_sec value from cassandra.yaml which defaults to 200. Depending on the network, the bandwidth available and the needs, it is possible to stop the command, tune the --bwlimit and restart rsync.

  Example: this takes about 10 hours in our example, as we are copying the same dataset as in the ‘natural’ way described above. The difference is we can run this in parallel on all the nodes, as we can control bandwidth and there is no need for any node to be down.

3. When the first rsync finishes, optionally disable autocompaction and stop running compactions, to avoid files being compacted and the same data being transferred again and again. This is an optional step, as there is a trade-off between the amount of downtime required later and keeping compaction up to date. If your cluster is always behind on compaction you may want to skip this step.

         nodetool disableautocompaction
         nodetool stop compaction
         nodetool compactionstats

        Explanations: These commands keep any new compaction from starting, then stop the compactions that are already running. The purpose is to make the files in old-dir totally immutable, so that we only have to copy the new data.

        Warning: Keep in mind Cassandra won’t compact anything between this step and the restart of the node. This will impact read performance after some time, so I do not recommend doing it before the first rsync: in most cases we don’t want the cluster to stop compacting for that long. If the dataset is small, it should be fine to disable/stop compactions before the first rsync. On the other hand, if the dataset is big and very active it might be a good idea to perform multiple rsync passes before disabling compaction, until the size of tmp-dir is close enough to the size of old-dir. This basically makes the operation longer, but safer.

        Example: In our example, let’s say one compaction was triggered during the first rsync, before we disabled it, merging 4 of the 100 GB files. So we now have 6 files of 100 GB and 1 of 350 GB. The problem is that rsync does not know this new 350 GB file contains the same data as the four 100 GB files already present in tmp-dir. Disabling compaction avoids this behaviour for the next rsync.

      4. Place a script we will later use on the node:

         curl -Os

        The script will need to be executable, and there are two user variables to configure:

         chmod u+x
         vim # Set 'User defined variables'
      5. Second rsync

         sudo rsync -azvP --delete-before <old-dir>/data/ <tmp-dir>

        Explanations: The second rsync has to remove from tmp-dir the files that were compacted away during the first rsync, as compactions were not yet disabled at that point. This is what the --delete-before option does, and it keeps Cassandra from re-compacting more than is needed once we give it the data back. As tmp-dir needs to mirror old-dir, using this option is fine. This second rsync is also runnable in parallel across the cluster.

        Example: This new operation takes 3.5 hours in our example. At this point we have 950 GB in tmp-dir, but meanwhile clients continued to write on the disk.

      6. Third rsync to copy the new files.

         sudo du -sh <old-dir> && sudo du -sh <tmp-dir>
         sudo rsync -azvP --delete-before <old-dir>/data/ <tmp-dir>

        Explanations: The existing files are now 100% immutable as compaction is disabled. We just need to copy the new files that were flushed to old-dir while Cassandra keeps running. Again, this is runnable in parallel.

        Example: Let’s say we have 50 GB of new files. It takes 0.5 hours to copy them in our case.

      7. Remove old-dir from the data_file_directories list in cassandra.yaml.

         sudo vim /etc/cassandra/conf/cassandra.yaml
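
        A quick way to confirm the edit took effect (the path below is hypothetical; use your own new-dir location):

           # After editing, only the surviving directory should remain in the list
           grep -A 3 '^data_file_directories' /etc/cassandra/conf/cassandra.yaml
           # Expected output, something like:
           # data_file_directories:
           #     - /var/lib/cassandra/new-dir/data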
      8. Run the script (node by node!) and monitor:

         sudo tail -100f /var/log/cassandra/system.log


        • The script stops the node, so it should be run sequentially, one node at a time.
        • It performs 2 more rsyncs:
          • The first one catches the diff between the end of the 3rd rsync and the moment the node is stopped. It should only take a few seconds, maybe minutes, depending on how quickly the script was run after the 3rd rsync ended and on the write throughput.
          • The second rsync in the script is a ‘control’ one. I just like to control things. Running it, we expect to see that there is no diff. It is just a way to stop the script if for some reason data is still being appended to old-dir (Cassandra not stopped correctly or some other weird behaviour). I guess this could be replaced or complemented by a check on the Cassandra service making sure it is down.
        • The next step in the script is to move all the files from tmp-dir to new-dir (the proper data folder remaining after the operation). This is an instant operation since, as mentioned earlier, the files are not really moved: they are already on the same disk.
        • Finally the script unmounts the disk and removes old-dir. A minimal sketch of such a script is shown below.
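
        The original script itself is not reproduced here (its URL is elided above). Purely as an illustration of the steps it is described as performing, here is a minimal sketch; paths, service name and device are hypothetical ‘user defined variables’, so do not treat this as the real script:

           #!/usr/bin/env bash
           # Minimal sketch of the per-node script described above.
           # Paths, service name and device are hypothetical: adapt before use.
           set -euo pipefail

           OLD_DIR="/var/lib/cassandra/old-dir/data"   # data directory being removed
           TMP_DIR="/var/lib/cassandra/tmp-dir"        # staging dir, on the same disk as NEW_DIR
           NEW_DIR="/var/lib/cassandra/new-dir/data"   # surviving data_file_directories entry
           OLD_DISK="/dev/sdb1"                        # device backing OLD_DIR

           # 1. Stop the node so nothing more gets flushed to old-dir.
           sudo service cassandra stop

           # 2. Catch-up rsync: copy whatever was flushed since the last manual rsync.
           sudo rsync -azvP --delete-before "${OLD_DIR}/" "${TMP_DIR}/"

           # 3. 'Control' rsync: a dry run that should report no pending change at all.
           changes=$(sudo rsync -a --dry-run --itemize-changes --delete-before "${OLD_DIR}/" "${TMP_DIR}/")
           if [ -n "${changes}" ]; then
               echo "old-dir and tmp-dir still differ, aborting:" >&2
               echo "${changes}" >&2
               exit 1
           fi

           # 4. Move the staged files into the surviving data directory.
           #    Same filesystem, so each mv is an instant rename rather than a copy.
           sudo find "${TMP_DIR}" -type f | while read -r src; do
               rel="${src#"${TMP_DIR}"/}"
               sudo mkdir -p "${NEW_DIR}/$(dirname "${rel}")"
               sudo mv "${src}" "${NEW_DIR}/${rel}"
           done

           # 5. Unmount the old disk, then drop the old directory and the staging leftovers.
           sudo umount "${OLD_DISK}"
           sudo rm -rf "${OLD_DIR}" "${TMP_DIR}"

           # 6. Restart Cassandra with a single data directory configured.
           sudo service cassandra start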

        Example: This will take a few minutes depending on how fast the script was run after the last rsync, the write throughput of the cluster and the data size (as it will impact Cassandra starting time). Let’s consider it takes 6 minutes (0.1 hours).


      So the ‘natural’ way (stop node, move, start node) in our example takes:

      10h * 30 = 300h

      Plus, each node is down for 10 hours, so every node needs to be repaired, as 10 hours is longer than the default hinted handoff window of 3 hours.

      The full ‘efficient’ operation, allowing transferring the data in parallel, takes:

      10h + 3.5h + 0.5h + (30 * 0.1h) = 17h

      Nodes are down for about 5-10 min each. No further operation needed.

      0 0

      • Maglev: A Fast and Reliable Software Network Load Balancer

        Maglev is Google’s network load balancer. It is a large distributed software system that runs on commodity Linux servers. Unlike traditional hardware network load balancers, it does not require a specialized physical rack deployment, and its capacity can be easily adjusted by adding or removing servers. Network routers distribute packets evenly to the Maglev machines via Equal Cost Multipath (ECMP); each Maglev machine then matches the packets to their corresponding services and spreads them evenly to the service endpoints. To accommodate high and ever-increasing traffic, Maglev is specifically optimized for packet processing performance. A single Maglev machine is able to saturate a 10Gbps link with small packets. Maglev is also equipped with consistent hashing and connection tracking features, to minimize the negative impact of unexpected faults and failures on connection-oriented protocols. Maglev has been serving Google’s traffic since 2008. It has sustained the rapid global growth of Google services, and it also provides network load balancing for Google Cloud Platform.
        Something we argued for quite a lot in Amazon, back in the day….

        (tags: google, paper, scale, ecmp, load-balancing, via:conall, maglev, lbs)

      • DIY DOG

        BrewDog releases their beer recipes for free. so cool! ‘So here it is. The keys to our kingdom. Every single BrewDog recipe, ever. So copy them, tear them to pieces, bastardise them, adapt them, but most of all, enjoy them. They are well travelled but with plenty of miles still left on the clock. Just remember to share your brews, and share your results. Sharing is caring.’

        (tags: brewing, homebrew, beer, brewdog, open-source, free, sharing)

      • National Children’s Science Centre due to open in 2018

        Good for science fans, not so hot for real tennis fans.

        The former real tennis court building close to the concert hall’s north wing would be used for temporary and visiting exhibitors, with a tunnel connecting it to the science centre. The National Children’s Science Centre is due to open in late 2018 and will also be known as the Exploration Station, said Dr Danny O’Hare, founding president of Dublin City University and chairman of the Exploration Station board since 2006.

        (tags: real-tennis, tennis, nch, dublin, science, kids, planetarium)

      0 0

      Apache CXF Fediz 1.3.0 will be released in the near future. One of the new features of Fediz 1.2.0 (released last year) was the ability to act as an identity broker with a SAML SSO IdP. In the 1.3.0 release, Apache CXF Fediz will have the ability to act as an identity broker with an OpenId Connect IdP. In other words, the Fediz IdP can act as a protocol bridge between the WS-Federation and OpenId Connect protocols. In this article, we will look at an interop test case with Keycloak.

      1) Install and configure Keycloak

      Download and install the latest Keycloak distribution (tested with 1.8.0). Start keycloak in standalone mode by running 'sh bin/'.

      1.1) Create users in Keycloak

      First we need to create an admin user by navigating to the following URL, and entering a password:

      • http://localhost:8080/auth/
      Click on the "Administration Console" link, logging on using the admin user credentials. You will see the configuration details of the "Master" realm. For the purposes of this demo, we will create a new realm. Hover the mouse pointer over "Master" in the top left-hand corner, and click on "Add realm". Create a new realm called "realmb". Now we will create a new user in this realm. Click on "Users" and select "Add User", specifying "alice" as the username. Click "save" and then go to the "Credentials" tab for "alice", and specify a password, unselecting the "Temporary" checkbox, and reset the password.

      1.2) Create a new client application in Keycloak

      Now we will create a new client application for the Fediz IdP in Keycloak. Select "Clients" in the left-hand menu, and click on "Create". Specify the following values:
      • Client ID: realma-client
      • Client protocol: openid-connect
      • Root URL: https://localhost:8443/fediz-idp/federation
      Once the client is created you will see more configuration options:
      • Select "Access Type" to be "confidential".
      Now go to the "Credentials" tab of the newly created client and copy the "Secret" value. This will be required in the Fediz IdP to authenticate to the token endpoint in Keycloak.

      1.3) Export the Keycloak signing certificate

      Finally, we need to export the Keycloak signing certificate so that the Fediz IdP can validate the signed JWT Token from Keycloak. Select "Realm Settings" (for "realmb") and click on the "Keys" tab. Copy and save the value specified in the "Certificate" textfield.

      1.4) Testing the Keycloak configuration

      It's possible to see the Keycloak OpenId Connect configuration by navigating to:
      • http://localhost:8080/auth/realms/realmb/.well-known/openid-configuration
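      The same document can be fetched from the command line as well (python -m json.tool is only used here to pretty-print the JSON):

          curl -s http://localhost:8080/auth/realms/realmb/.well-known/openid-configuration | python -m json.tool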
      This tells us what the authorization and token endpoints are, both of which we will need to configure the Fediz IdP. To test that everything is working correctly, open a web browser and navigate to:
      • http://localhost:8080/auth/realms/realmb/protocol/openid-connect/auth?response_type=code&client_id=realma-client&redirect_uri=https://localhost:8443/fediz-idp/federation&scope=openid
      Login using the credentials you have created for "alice". Keycloak will then attempt to redirect to the given "redirect_uri" and so the browser will show a connection error message. However, copy the URL + extract the "code" query String. Open a terminal and invoke the following command, substituting in the secret and code extracted above:
      • curl -u realma-client:<secret> --data "client_id=realma-client&grant_type=authorization_code&code=<code>&redirect_uri=https://localhost:8443/fediz-idp/federation" http://localhost:8080/auth/realms/realmb/protocol/openid-connect/token
      You should see a successful response containing (amongst other things) the OAuth 2.0 Access Token and the OpenId Connect IdToken, which contains the user identity.
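
      If you want to script that last step, something along these lines works; the "access_token" and "id_token" field names are standard OAuth 2.0 / OpenId Connect response members, and <secret> and <code> are the values extracted above:

          # Save the token response, then print the two tokens from it
          curl -s -u realma-client:<secret> \
            --data "client_id=realma-client&grant_type=authorization_code&code=<code>&redirect_uri=https://localhost:8443/fediz-idp/federation" \
            http://localhost:8080/auth/realms/realmb/protocol/openid-connect/token > token.json
          python -c "import json; t = json.load(open('token.json')); print(t['access_token']); print(t['id_token'])"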

      2) Install and configure the Apache CXF Fediz IdP and sample Webapp

      Follow a previous tutorial to deploy the latest Fediz IdP + STS to Apache Tomcat, as well as the "simpleWebapp". Note that you will need to use Fediz 1.3.0 here (or the latest SNAPSHOT version) for OpenId Connect support. Test that the "simpleWebapp" is working correctly by navigating to the following URL (selecting "realm A" at the IdP, and authenticating as "alice/ecila"):
      • https://localhost:8443/fedizhelloworld/secure/fedservlet
      2.1) Configure the Fediz IdP to communicate with Keycloak

      Now we will configure the Fediz IdP to authenticate the user in "realm B" by using the OpenId Connect protocol. Edit 'webapps/fediz-idp/WEB-INF/classes/entities-realma.xml'. In the 'idp-realmA' bean:
      • Change the port in "idpUrl" to "8443". 
      In the 'trusted-idp-realmB' bean:
      • Change the "url" value to "http://localhost:8080/auth/realms/realmb/protocol/openid-connect/auth".
      • Change the "protocol" value to "openid-connect-1.0".
      • Change the "certificate" value to "keycloak.cert". 
      • Add the following parameters Map, filling in a value for the client secret extracted above: <property name="parameters">
                        <entry key="" value="realma-client"/>
                        <entry key="client.secret" value="<secret>"/>
                        <entry key="token.endpoint" value="http://localhost:8080/auth/realms/realmb/protocol/openid-connect/token"/>
      2.2) Configure Fediz to use the Keycloak signing certificate

      Copy 'webapps/fediz-idp/WEB-INF/classes/realmb.cert' to a new file called 'webapps/fediz-idp/WEB-INF/classes/keycloak.cert'. Edit this file + delete the content between the "-----BEGIN CERTIFICATE----- / -----END CERTIFICATE-----" tags, pasting instead the Keycloak signing certificate as retrieved in step "1.3" above.

      Restart Fediz to pick up the changes (you may need to remove the persistent storage first).

      3) Testing the service

      To test the service navigate to:
      • https://localhost:8443/fedizhelloworld/secure/fedservlet
      Select "realm B". You should be redirected to the Keycloak authentication page. Enter the user credentials you have created. You will be redirected to Fediz, where it converts the received JWT token to a token in the realm of Fediz (realm A) and redirects to the web application.

      0 0

      • Proportional Representation in Ireland: How it Works

        Excellent explanation of PR-STV and the Irish voting system. Don’t be a Plumper! (via John O’Shea)

        (tags: plumpers, pr-stv, pr, voting, ireland, politics, via:joshea)

      • Microsoft warns of risks to Irish operation in US search warrant case

        “Our concern is that if we lose the case more countries across Europe or elsewhere are going to be concerned about having their data in Ireland, ” Mr Smith said, after testifying before the House judiciary committee. Asked what would happen to its Irish unit if the company loses the case or doesn’t convince Congress to pass updated legislation governing cross-border data held by American companies, the Microsoft executive said: “We’ll certainly face a new set of risks that we don’t face today.” He added that the issue could be resolved by an executive order by the White House or through international negotiations between the Irish Government or the European Union and the US.

        (tags: microsoft, data, privacy, us-politics, surveillance, usa)

      • How To Implement Secure Bitcoin Vaults

        At the Bitcoin workshop in Barbados, Malte Möser will present our solution to the Bitcoin private key management problem. Specifically, our paper describes a way to create vaults, special accounts whose keys can be neutralized if they fall into the hands of attackers. Vaults are Bitcoin’s decentralized version of you calling your bank to report a stolen credit card — it renders the attacker’s transactions null and void. And here’s the interesting part: in so doing, vaults demotivate key theft in the first place. An attacker who knows that he will not be able to get away with theft is less likely to attack in the first place, compared to current Bitcoin attackers who are guaranteed that their hacking efforts will be handsomely rewarded.

        (tags: private-keys, vaults, bitcoin, security, crypto, theft)

      0 0

      This is a continuation of the video blogs I have done about our development on the fabric8 Camel tools.

      Today is Friday, so it was a chance to grab a beer and do a one-take video recording. This time I demonstrate how the Camel tools are able, from the cursor position in your Java editor such as IDEA or Eclipse, to add or edit Camel endpoints in a type-safe way using a wizard. What is cool about the command is that you just put the cursor on the line with the endpoint to edit, or place the cursor where you want to add the endpoint.

      The video is 7 minutes long and shows Eclipse and IDEA, where I edit two different Camel projects. One project uses XML for declaring the Camel routes, and the other uses Java. The tools work with both kinds.

      So sit back and grab a beer and watch, or if you are in the office then maybe grab a coffee or tea.

      We are working on doing the same for the EIP patterns and have that working for XML, but the Java bits are still pending. Down the road, what you see in this video will be possible across the board in Camel.

      0 0

      There was a time when things like Apache Hadoop, which copied Google's technology, felt to us like incredibly cutting-edge, state-of-the-art technology. I think that happened because the real technology was hidden, invisible, behind a shabby UI with nothing but a single search box. Was it only the technology? Similar services dressed up with flashy UIs seemed to suggest that competition was possible. By now, we are probably experiencing a technology gap with them that is hard even to copy.

       There is one remark from Naver CSO Lee Hae-jin (probably from the '08 New Year's address) that remains vivid in my mind. Roughly summarized: before Google Maps/Earth was opened to the public, Google invited the heads of internet companies from each country to a preview, and he was deeply shocked by what he saw there. At the time he demanded, in an agitated voice, that employees stay alert, but I doubt many of them shared his sense of urgency.

       Google's DeepMind, so active these days, goes back to '13. Larry Page went in person to make the acquisition; what exactly did he see back then? As he said at TED 2014, he probably saw the future. Right now the mood is close to going all-in on machine learning, and my guess is that what we are seeing is only the tip of the iceberg.

       I told my wife about computers learning how to paint and preparing to face professional Go players, and she wasn't particularly surprised; she accepted it surprisingly readily. She probably thinks of it at the level of the old Photoshop filters and chess games, but the fact is that in fields like painting, composition and Go, these systems learn in a few hours what takes a human more than a decade to master. That is exactly where my excitement comes from: the level reached is not amateur, it is the very top of the professional world.

       Meanwhile, as someone who has spent nearly ten years developing in the big data open source field and who belongs to a corporate AI team, one thing I can say for certain is that the technology gap is already enormous. Reading the papers posted at NIPS, I felt keenly that this is not a field I can approach on the strength of my modest engineering skills alone, and it did not look like something that could be pushed through by hastily throwing money and people at it either.

       My fear comes from this: what such a technology gap foreshadows is the elimination of the companies and individuals who fail to secure the technology. People seem to point to 2045 for the singularity and so on, but I think it will come somewhat sooner.

       To our eyes the change will come gradually, but I think things will shift dramatically within the next one or two decades. If I imagine the outcome (admittedly going a bit far): AI and robots replace human labour, civilization avoids the regression a shrinking workforce would otherwise bring, and only a small elite group builds a utopia on a pleasant Earth. In other words, the end of the middle class and the poor. If you say the delusion is excessive, or ask whether anyone is really going to die right away, I would ask in return: who, at this very moment, cares about the children dying in Africa?

      0 0

      I've been hanging out with other bits of the Bristol tech/dev world recently, rather than the usual big data suspects.

      Two weeks ago, I actually infiltrated the local Test meetup, which was entertaining, not just for the Bath Gem Ale which JustEat has on tap in their meeting area, but because I got to sit there, beer in hand, while speakers covered Applied Exploratory Testing, what it's like turning up as a QA team at a new company, and the patterns-of-untestability that you encounter (and how to start to get things under control).

      This week, I took a break from worrying about the semantics of flush() and its impact on the durability of timeline 1.5 event histories (i.e. why my incomplete apps aren't showing up in an ATS-backed Spark History server if file:// is the intermediate FS of the test run). I wandered down to the Watershed cinema with Tom White, plus a light hangover related to Tom White's overnight stay (including an evening visit to the Bravas tapas bar), and into a one-day dev conference, Voxxdev Bristol 2016.

      It was a good day. Oracle have been putting a lot of effort into the conf as a way of raising the visibility of what's going on in tech in the area, and making more people aware that the west of England is a more interesting place to be than London; together with other companies and one of the local universities they had put together a day-long conference.

      I was one of the speakers; I'd put in my Household Infosec talk, but the organisers wanted something more code-related, and opted for Hadoop and Kerberos, the Madness Beyond The Gate. I don't think that was the right talk for the audience. It's really for people writing code to run inside a Hadoop cluster, and to explain to the QE and support people that the reason they suffer so much is those developers aren't testing on secure clusters. (that's the version I gave at Cloudera Palo Alto last month). When you are part way into a talk and you realise that you can't assume the audience knows how HDFS works then you shouldn't really be talking to them about retrieving block-tokens from the NN from a YARN app handed off a delegation token in the launch context by a client authed against the KDC. Normally I like a large fraction of the audience to come out of a talk feeling they'd benefited; this time I'm not sure.

      I felt a bit let down by the Oracle big data talk, though impressed that people are still writing Swing apps. I was even more disappointed by the IoT talk, where the speaker not only accused Hadoop of being insecure (he missed my talk, then), but most of his slides seemed lifted from elsewhere: one a Cisco IoT architecture, one a Dell Hadoop cluster design. Julio pointed out later that the brontobyte slide was one HP Labs have been using. Tip: if you use others' slides, either credit them or make sure the authors aren't attendees.

      I really liked some of the other talks. There was a great keynote by a former colleague, Dave Cliff, now at Bristol Uni, talking about what they are up to. This is the lecture series on their cloud computing course. 

      Cloud Computing Curriculum

      That's a big change given that in 2010, all they had was a talk Julio and I gave in the HPC course.

      I might volunteer to give one of the new course's talks in exchange for being able to sit in on the other lectures (and exemption from exams, tutorials and homework, obviously).

      My favourite talk turned out to be Out of the Fire Swamp, by Adrian Colyer.

      Adrian writes "The morning paper"blog, which is something I'm aware of, in awe of and in fear of. Why fear? There's too many papers to read; I can't get through 1/day and to track the blog would only make it clear how behind I was. I already have a directory tree full of papers I think I need to understand. Of course, if you do read a related paper/day, it probably gets easier, except I'm still trying to complete [Ulrich99], as my and relate it to modern problems.

      Adrian introduced the audience to data serialization, causality and happens-before, then into linearizability [HW90].

      Read Committed

      This was a really good talk.

      Expectations and Reality

      All the code we write is full of assumptions. We assume that n + 1 > n, though we know in our heads that if n = 2^31 - 1 and it is stored in a signed int32, that doesn't hold. (More formally, in two's complement binary arithmetic, for registers of width w, n + 1 > n only holds for n < 2^(w-1) - 1.)
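
      A quick illustration from a shell, whose arithmetic is 64-bit two's complement on typical systems (so the boundary is 2^63 - 1 rather than 2^31 - 1):

          n=9223372036854775807     # 2^63 - 1, the largest signed 64-bit value
          echo $(( n + 1 > n ))     # prints 0: the assumption fails at the boundary
          echo $(( n + 1 ))         # wraps around to -9223372036854775808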

      Sometimes even that foundational n + 1 > n assumption catches us out. We assume that two assignments in source code happen in order, though in fact the JVM reserves the right to re-order things (and has, in the past, even done so wrongly), and the CPU can reorder stuff as well.

      What people aren't aware of in modern datacentre-scale computing is what assumptions the systems underneath have made in order to deliver their performance/consistency/availability/liveness/persistence features, or whatever it is they offer. To put it differently: we think we know what the systems we depend on do, but every so often we find out our assumptions were utterly wrong. What Adrian covered are the foundational, CS-complete assumptions that you had really better be asking hard questions about when you review technologies.

      He also closed with a section on future system trends, especially storage: things like non-volatile DIMMs (a capacitor + SSD to do a snapshot backup on power loss), faster SSDs, and persistent technologies with performance between DRAM and SSD. We are looking at a future where tomorrow's options for durability vs. retrieval time are going to be (at least at some price points) significantly different from today's. Which means that we'd better stop hard-coding those assumptions into our systems.

      Overall, a nice event, especially given it was held in the second closest place to my house where you could hold a conference (Bristol University would have been a 7 minute rather than a 15 minute walk). I had the pleasure of meeting some new people in the area working on stuff, including the illustrious James Strachan, who'd come up from Somerset for the day.

      I liked the blend of some CS "Lamport layer" work with practical stuff; it gives you both education and things you can make immediate use of. I know Martin Kleppmann has been going round evangelising classic distributed computing problems to a broader audience, and Berlin Buzzwords 2016 has a talk, Towards Consensus on Distributed Consensus; it's clearly something the conferences need.

      If there was a fault, as well as some of the talks not being ideal for the audience (mine included), I'd say it had the usual lack of diversity of a tech conference. You could say "well, that's the industry", but it doesn't have to be, and even if that is the case it doesn't have to be so unbalanced in the speakers. At the BBuzz event, not only are there two women keynoting, but we submission reviewers were reviewing the talks anonymously: we didn't know who was submitting, instead going on the text alone.

      For the next Bristol conference, I'd advocate closer Uni/industry collaboration by offering some of the students tickets, maybe even some of the maths & physics students rather than just CS. I also think maybe there should be a full strand dedicated to CS theory. Seriously: we can do causality, formality, set theory & relational algebra, Paxos+ZK+REEF, reliability theory, graphs etc. Things that apply to the systems we use, but stuff people kind of ignore. I like this idea ... I can even think of a couple of speakers. Me? I'd sit in the audience trying to keep up.

      [HW90] Herlihy and Wing, Linearizability: A Correctness Condition for Concurrent Objects, 1990.

      [Ulrich99] Ulrich, A.W., Zimmerer, P. and Chrobok-Diening, Test architectures for testing distributed systems, 1999. Proceedings of the 12th International Software Quality Week.

      0 0

      A little local coverage from one of my favorite websites: The Point of Diminishing Returns for Adult-Beverage Enthusiasts

      Once the fog deepens, nearly covering the western span of the Bay Bridge, the deserted base looks more like the abandoned movie set it was than the burgeoning home of alcohol innovation it has become.

      0 0

      Two weeks ago I awoke to the discovery that I can squeeze my butt again! Those of you who read my last blog post know that I have paralysis across my butt and down the outsides of my hamstrings and that in that post I said: 
      'Even if the movement of my feet does not return, I really wish that I could regain the feeling in my butt and the ability to squeeze the muscles so that I could build them back up again.'
      Well believe it or not, I got my wish! I could hardly believe it myself! I was still lying in bed on Sunday morning when I made the discovery. I was in such disbelief that I laughed out loud and woke up Janene. As she heard me she bolted upright, bleary eyed and said, 'Are you OK?' Still laughing I told her I can squeeze my butt and we both could hardly believe it. Even though it was a very small squeeze, it's a sign that the healing is starting to take place. 

      Technically the muscles in the butt are the gluteal muscle group shown in the diagram, comprised of the gluteus maximus, gluteus medius and gluteus minimus. The ability to squeeze these muscles is controlled by nerves that carry impulses sent from the brain down the spinal column to the muscle to cause a contraction. The fact that the nerves have healed enough to allow me to squeeze them is a really good sign; it means that my body is healing itself. 

      The squeeze was very small and quite weak, but it was a start. Because these muscles have basically been dormant for five months, they are terribly atrophied and therefore extremely weak. But even in the two weeks since this movement returned, I have been working the muscles to build them back up and the squeeze has only increased. At this point, it's not a huge increase, but as my Mom always told me growing up, 'Slow and steady wins the race.' 

      Who would have thought I would be so happy about such a minor thing. But when I experienced such a devastating injury that forever changed my life, I learned very quickly to be happy for what I still have, as I have mentioned before. Now it's just a matter of working these muscles regularly via rehab to bring them back to life. Speaking of rehab, I also made a big change on this front last week. 

      Changing My Rehab 

      Since being released from Craig Hospital in June, I have been going back to Craig for rehab. After all, it is a world-renowned hospital for spinal cord and brain injuries. When I was first released from the hospital, my Dad was still in town and was driving me wherever I needed to go, including to rehab at Craig Hospital. At first, I was going to rehab at Craig three times a week. It helped a lot to be in close contact with my physical therapist and to continue seeing my friends there. But it didn't take long for me to really get tired of making the 90 mile round trip and sitting in traffic for 2.5-3+ hours each time we made the trip. Remember, this was when I was still exhausted all the time and this drive only made things worse for me. I also got wise to the fact that insurance companies only pay for a certain number of visits. So I decided to keep doing my rehab at home and only check in with my PT at Craig once a week to more or less maximize my PT visits. For a while this worked, but because I am now back to work full-time, even making the trip to Craig once a week sank a lot of time and I didn't get a lot of benefit from a one hour appointment once a week. So I began looking into other options, including Boulder Community Health's Mapleton Center for Outpatient Rehabilitation and also a company that specializes in spinal cord injury (SCI) rehab named Project Walk. 

      Project Walk was especially compelling to me because it focuses on rebuilding the muscle mass that SCI patients lose from the injury and hospitalization. The professionals at Project Walk help patients to design a workout specifically for them and their situation to focus on their own goals. My ultimate goal is to walk again without the need for braces and crutches, and although this is dependent upon my body and its ability to heal, there's a lot that can be done in the meantime to get my body ready for more movement to return. I applied to Project Walk and received a call back within a day and began talking to them. Everything sounded great and was very much in tune with the way that I have always enjoyed pushing myself in my physical fitness, but there was one catch -- they wanted me to come to their San Diego office for three weeks. The problem with this is that I am just too busy at work right now with recruitment duties for open positions and I don't feel like I can put this on hold for three weeks. Because of this, I decided to look into a more local solution in Boulder for now. 

      Boulder Community Health has an outpatient rehabilitation clinic called the Mapleton Center. A dear friend of mine who experienced a spinal injury a couple years ago went here for his rehab and told me that they really helped him. So I paid them a visit and got an evaluation by a PT who worked at the Spaulding Institute in Boston prior to coming to the Mapleton Center. Spaulding is a rehabilitation clinic out east that is well-known for its SCI program. So this week I began doing rehab at the Mapleton Center to see if this PT can help get me on the road to a more rigorous workout that will help me work toward my goals. This certainly doesn't mean that I have ruled out Project Walk; in fact, it is still very much on my mind. 

      In speaking to Project Walk, I have learned that this place is a premier rehab clinic for SCI patients. Based on 10 years of medical research and partnerships with hospitals and universities, Project Walk is like no other rehab clinic I have discovered. And although they originally wanted me to come out there for three weeks, in speaking with them they suggested that perhaps we could condense it to a week and just work a lot more hours while I'm there. Furthermore, I also learned that they are opening a clinic in the Boulder/Denver area in March 2015. So I'm kinda thinking that I need to see how things play out at the Mapleton Center before traveling to Project Walk in San Diego. If I can attend PT in Boulder for a while and then go to Project Walk in San Diego, perhaps I can be ready to take on even more when the Project Walk clinic opens here in the Boulder/Denver area. 

      Dinner With Gareth and Mike 

      This past week I had dinner with my coworker Mike O'Donnell and his buddy Gareth, who helped me as I lay suffering in the street right after the accident. Not only was it wonderful to see Gareth again, this time with a clear head, it was also great to have dinner with my co-worker Mike who I really like. I learned a lot about both Gareth and Mike that night and I really enjoyed our time together. Spending some time with Gareth in a social setting really clued me in to who he is, and I discovered that we have a lot in common in terms of the way we look at the world. Gareth also told me about a fascinating book that I'm just beginning to read now. 


      Anyone who knows me knows that I'm constantly reading something. I'm always on the lookout for new books to read and, in fact, I even keep a list of books in a notebook in Evernote (which I use for everything now). The book Gareth told me about is titled Biology of Belief. This book is about how new research shows that our DNA does not control our biology; instead our DNA is controlled by our positive and negative thoughts. Certainly this topic is of extreme interest to me right now because of my medical situation. I don't have much to say about it yet because I haven't read the book, but suffice it to say that I am reading and trying everything I can get my hands on at this point to help heal myself. 

      When I told Janene about this book, she said it sounded similar to one recommended recently by a co-worker titled You Are the Placebo. This book is about how one's brain and body are shaped by their thoughts, their emotions and their intentions, not the other way around. Again, a captivating topic for me right now so I plan to read this book next. 

      Perhaps these two books will help me move from the hope of more movement to the real belief that I am going to get movement back and I am going to walk one day. After all, I did tell Project Walk that my goal is to walk, but my dream is to one day cycle and run again. 

      0 0

      Recently got some new leg braces to replace the old ones that kept breaking repeatedly.

      Every week or two, I would need to schedule an appointment to get the old braces fixed. I visited two different locations of a huge orthotics company and was told by folks at both locations that they had never seen someone break the braces as often as I did. So they were always asking me what I was doing, and my response was that I was just walking. But when they started asking me how much I was walking they began to understand why the braces kept breaking.

      Below on the left you can see the old braces -- they have a plastic foot bed and calf support with aluminum struts along each side. The new braces are one continuous plastic mold.

      [Photos: old braces (left), new braces (right)]

      The new braces have absolutely no flex to them whatsoever. By contrast, the old braces had enough joints that the large amounts of movement had kind of worked them somewhat loose giving them the feeling that there was at least some flex. In fact, with the old braces, I even broke the aluminum struts ... twice.

      The first strut break involved a big chunk of the aluminum just popping out while I was out on a walk around my neighborhood. This meant that the ankle on that side was free to move, which my body was too weak to handle, and I had to ask a neighbor for a ride home. The second strut break occurred while I was out walking and it just snapped in half. Although it took me a while, I was able to hobble home on my own. But then I had to cut a ruler in half to splint the strut and wrap it with a bunch of duct tape so that I could keep walking until they could order a new strut. These two breaks really surprised the orthotists!

      Now I am learning to walk in these new braces which is quite a challenge. I have more stability with the new braces but they are much more rigid. So the motion is different. At any rate, I am already happier with the new braces because they don't feel like they are going to fail at any moment.

      I am so thankful that I am fortunate enough to no longer be in the wheelchair, that I can actually walk, even if it is with arm crutches. Small improvements every day amount to big improvements over time. This is now my goal -- small, continuous improvements.

      0 0

      At the Neuschwanstein Castle outside of Munich
      Last month my family and I traveled to Germany for vacation and had a wonderful time. With so much history to see and things to experience, there was no shortage of activities to keep us busy.

      We started in a city where we had never been, Berlin. With a population of 3.5 million people, Berlin has an incredible number of activities to choose from, which made it difficult because we only had a few days. We took a hop-on/hop-off bus tour around the city, which was challenging for me but nevertheless pretty fun. When we realized that the buses were basically staying in the same general area, we just decided to walk and had fun checking out the city on foot. We saw many sights including the Brandenburg Gate, the Memorial to the Murdered Jews of Europe, the Topography of Terror and much more. One day we took the train and bus across the city to Waldhochseilgarten Jungfernheide. This place consists of a ropes course high in the forest that you climb and traverse while wearing climbing gear so that there's no chance of falling. Janene and the girls really enjoyed this adventure while I took many photos from the ground beneath them. Although it was very hot and we stayed in a hotel with no air conditioning, we still had a wonderful time relaxing and letting our curiosity drive us to sites and cafes all around the city. After Berlin, we took a train to the city of Halle about 1.5 hours south.

      Halle is where our au pair from about 10 years ago named Henriette and her husband Franz were both born and raised. We were lucky enough to spend some time with them and their little boy Gustav and also attend their wedding where we met many of their friends. We had such a good time and were so happy to see their wedding in-person rather than only view the photos after the fact. We also visited many historic places in Halle. Of course, no visit to Halle is complete without a visit to the oldest chocolate factory in Germany, the Halloren Chocolate Factory. Not only does it have amazing chocolates but it also has an interesting and varied history. We even ate döner kebab for the first time and even though we are not big meat eaters it was pretty delicious! We were also able to visit a friend of Henriette's named Franzeska, her husband Daniel and their little boy Johann. Franzeska was an au pair for some close friends of ours while Henriette was with us and so we got to know her during that time as well. After several days in Halle, we took a train about five hours south to Munich.

      At a cafe in Berlin
      After a long train ride we arrived in Munich, a city with which I am somewhat familiar being that my company is located there. Munich is such a different city compared to Berlin and we were told by many people that this is due to the differences between the old Eastern vs. Western Germany. Just as in Berlin, Munich is full of things to see and do. One day, we took a bus tour south of Munich to see Neuschwanstein Castle -- and Linderhof Castle, both of which were incredibly opulent and somewhat amazing for the time in which they were built. We did a tremendous amount of walking on this day around the castles. Despite my leg braces and using both arm crutches the whole time, I kept up with the tours pretty well. We also spent time in Munich just hanging around the city to see the sites. This also involved a lot of walking but I actually didn't mind it. I took breaks when I needed to and enjoyed the sites. One night we took the train across the city to the Osterwald bier garten in a different area of Munich to meet a friend for dinner. It was wonderful to see her and visit for the evening.

      Osterwald bier garten in Munich
      There is nothing like a true vacation where you unplug completely, forget about all of your responsibilities and don't worry about anything. This was one of those vacations. 

      After my spinal cord injury, it was really difficult for me to enjoy very much for a long time because I was so uncomfortable all the time. Taking this vacation was a true test for me both physically and mentally. I am happy to say that I did not feel left out, though I was jealous that my family got to spend one afternoon taking a bike tour around Munich. Still, this made me think that I can do many things that I was not sure would be possible. Yet again, I feel so lucky to have the love and support of my family who have always stuck by me.

      0 0

      5 Things That Will Make Blood and Wine Great

      Although originally given a tentative release for the first quarter of 2016, like all good ambiguously dated things Witcher 3’s second and more sizable expansion, Blood and Wine, has been pushed back to the first half of 2016. Which does technically still include the first quarter, but come on. They wouldn’t be giving themselves those extra three months of leeway if they didn’t think they needed it.

      So while we may have to wait a bit longer for our next and seemingly final foray into the life of Geralt of Rivia, we've still got enough nuggets of information to dive in, Scrooge McDuck style, and swim in the oceans of speculation until CD Projekt Red lock down dates, deadlines, and details for us all. To that end, and knowing what we know, here are five things that could be done in Blood and Wine that would make it great.

      Must. Find. Time. To. Complete. Pillars. Of. Eternity. And. Firewatch. Soon.

      0 0

      After nearly two years of no movement below my knees, I was pretty surprised to discover recently that there is movement in the achilles tendons on both legs!

      Back in late January, I went back to Craig Hospital for the annual re-evaluation of my spinal cord injury. During the four days of poking and probing my body, just like last year, they told me again, 'According to your internal organs, we can't tell that you have had a spinal cord injury.' This does not mean that I am unaffected; it means that I do not have the deterioration of internal organ function that they commonly see in the kidneys, liver, etc. Many paraplegics are in a very bad way internally due to a whole host of issues that arise as side effects of the nerve damage. For example, I still have to deal with bladder and bowel issues from the nerve damage. Now, these issues have improved significantly from the time of the injury nearly two years ago, but I am also lucky enough that my injuries were not as bad as they could have been. (I could go on and on about the degrees of damage here but I will spare you the details.)

      There were two really positive discoveries during my re-evaluation: improved sensation and some new movement. First was the discovery that I have much more sensation in my hamstrings, lower legs and feet than I realized. Nerves heal at such a slow rate that it's difficult to gauge the level of improvement on any given day. But if you have a baseline against which you can compare, then you can quantify the amount of improvement. I knew that the sensation had improved, but it was difficult to tell how much. Additionally, if you know anything about nerve sensation then you know that there are different flavors of it -- basically soft/light touch to heavy touch to sharp touch (and everything in between). Interestingly, damaged nerve sensation can go from numb at one end of the scale to hypersensitive at the other end. So they tested most of my body for nerve sensation and were able to compare my results to one year ago and to the initial injury, and the improvements are significant. It proved that my body is still healing itself. Besides the improved sensation, there was another big discovery that I already mentioned.

      If you look at the two images above of the nerve dermatomes, pay attention to those from the waist down related to anything L3/L4 or below (this includes all the vertebrae in the lower spine: L3, L4, L5, S1, S2, S3, S4 and S5). Right now I'm dealing with sensation issues in my feet, lower legs, hamstrings, tailbone and crotch areas. The great thing is that my body is still healing and nobody knows how far it will go over time.

      The second discovery was that the achilles tendons in both legs have some movement! There was just barely movement there, but it's movement nonetheless. It is very similar to the way that my butt/glutes returned -- it was such a minor amount of movement that I did not believe it at first. However, during my recovery, I have also learned that even minor movement can snowball over time into much more movement as the muscles are rebuilt. My glutes have largely returned, but they are still not 100% and won't be for some time, so I continue to work on rebuilding them. The same will be true for my calves. Right now my calves are basically gone due to the muscle atrophy. But in time as I move them I will enlist these muscles more and more and they will rebuild. In fact, in the 30 days or so since the movement in my achilles tendons was discovered, I can now feel that my calves are just beginning to engage. But only just beginning. It's going to take a lot more work over a long period of time before I can visibly produce dorsi- or plantar-flexion of my feet (which is driven by the calf muscles).

      As I have been telling everyone for the last year, in addition to waiting for my nerves to heal more, the big focus of my physical therapy has been fighting back against the severe amount of muscle atrophy that occurred as a result of my injuries. It took me a while to realize that the muscle atrophy was something from which I could recover (as long as there was movement). At first, I was so weak from the immobilization after the emergency surgery and spending nine months in a wheelchair that I just thought the weakness was caused by the paralysis. To a certain degree this was true, but what I have learned over time is that if I have movement I can rebuild the muscle. There is such a thing as having the movement return but perhaps not having 100% of the sensation return. Another positive thing is that I have experience dealing with muscle atrophy. Having dealt with this way back in high school with my first couple of knee injuries, I understand the dedication and hard work involved. It takes a lot of consistent, deliberate, difficult work to rebuild muscles that have shrunk away. As many people have asked me, 'Are you still going to PT?' and my response is, 'PT is a way of life for me, it won't stop for years.'

      Anyway, because the movement is starting to return, I was fitted for a new kind of brace (this is the third type of brace now). The type of braces I need are called ankle-foot orthotics, or AFOs. The first type of AFOs I had were rigid hybrids comprised of aluminum struts with plastic foot and calf beds. The second type of AFOs I had were rigid plastic with some carbon fiber reinforcement around the ankles -- rigid meaning no flexion whatsoever. I am now on the third type of AFOs; these are known as dynamic response AFOs and they are made of carbon fiber.

      See the image to the right and notice that there is only a thin strip of carbon fiber along the back of the achilles tendon area. This strip is flexible for two reasons:

      1. To enlist the calf muscles in the stride
      2. To rebound once it is loaded from the ankle flexion so as to provide a more natural stride 
      Just like the previous change in AFOs, these new AFOs changed my stride again. It's much more natural and I don't need to pick up my feet as much because my ankles are flexing naturally. The downside is that I am pretty damn wobbly right now due to my lack of calf muscles, but this will improve over time. Also, my feet and lower legs are sore from the new material and the way it is squeezing me. I have already had a couple of adjustments to them and I will need more. But it is also a matter of your body getting used to them, kinda like a new pair of shoes. Hopefully this will all improve over time.

      Someday I will post all of the videos that Janene has made of me walking at the different stages throughout my recovery. It's pretty amazing to see the progress so far and I'm not even done yet. As I tell myself quite often, never give up.

      0 0


      Apple device users have probably taken and stored 100 billion photos:

      • In early 2013, the number was 9 billion
      • There are 100 million iPhones in active use in 2015. If each iPhone takes 1000 pictures per year, that’s 100 billion photos in 2015 alone.
      • Photos are automatically backed up to iCloud since iOS 5

      I’d assumed that iCloud is a massive compute and storage cloud, operated like the datacenters of Google and Amazon.

      Turns out that, at least for photo storage, iCloud is actually composed of Amazon's S3 storage service and Google's Cloud Storage service. I serendipitously discovered this while copying some photos from my camera's SD card to my Macbook using the native Photos app. I'd recently installed 'Little Snitch' to see why the camera light on my Macbook turns on for no reason. Little Snitch immediately alerted me that Photos was trying to connect to Amazon's S3 and Google's Cloud Storage:


      So it looks like Apple is outsourcing iCloud storage to two different clouds. At first glance this is strange: AWS S3 promises durability of 99.999999999%, so backing up to Google gains very little reliability for a doubling of cost.

      It turns out that AWS S3 and Google Cloud Storage are used differently:


      For the approximately 200 hi-res photos that I was copying from my camera’s SD card, AWS S3 stores a LOT (1.58 GB), while Google stores a measly 50 MB. So Apple is probably using Google for something else. Speculation:

      AWS S3 has an SLA of 99.99%. For the cases where it is unavailable (but photos are still safe), Google can be used to store / fetch low-res versions of the Photo stream.

      The Google location could also be used to store an erasure code, although from the size, it seems unlikely.

      Apple charges me $2.99 per month (reduced from $3.99 per month last fall) for 200GB of iCloud storage. Apple should be paying (according to the published pricing) between $2.50 and $5.50 per month to Amazon AWS for this. Add in a few pennies for Google's storage, and they are probably break-even or slightly behind. If they were to operate their own S3-like storage, they would probably make a small-to-medium profit instead. I've calculated some numbers based on 2 MB per iPhone image.

      per TB per month   2 PB (1 billion photos)   20 PB (10 billion photos)   200 PB (100 billion photos)   2000 PB (1 trillion photos)
      -$5                -$10,000                  -$100,000                   -$1,000,000                   -$10,000,000
      -$10               -$20,000                  -$200,000                   -$2,000,000                   -$20,000,000
      $10                $20,000                   $200,000                    $2,000,000                    $20,000,000
      $20                $40,000                   $400,000                    $4,000,000                    $40,000,000
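
      A back-of-the-envelope check of the 200 PB column, using the assumptions behind the table (2 MB per photo and a $5 per TB-month storage price; these are assumptions, not published Apple or AWS figures):

          # 100 billion photos * 2 MB each, expressed in TB (1 TB = 10^6 MB here)
          echo "$(( 100 * 1000 * 1000 * 1000 * 2 / 1000000 )) TB"   # prints 200000 TB, i.e. 200 PB
          # 200,000 TB at $5 per TB-month
          echo "\$$(( 200000 * 5 )) per month"                      # prints $1000000 per month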

      Given Apple’s huge profits of nearly $70 billion per year, paying Amazon about a quarter of a billion for worry-free, infinitely scalable storage seems worth it.

      I haven’t included the cost of accessing the data from S3, which can be quite prohibitive, but I suspect that Apple uses a content delivery network (CDN) for delivering the photos to your photo stream.


      Multi-cloud is clearly not a mythical beast. It is here and big companies like Apple are already taking advantage of it.