
    I've been playing around with react-native for a week or so, and I'm liking it.
    If you don't know what it is, it's basically a toolkit for building mobile apps, one that makes it easy to build good-looking apps that integrate with your phone without you having to learn low-level details about how phones work (yay for that!).
    One of the great things is that, with Android at least, you can use your own favourite editing tools and the Android SDK and react-native will build your app, install it on your phone, start it running, and attach a debugger to the app and the developer tools in your web browser.
    All from one simple command:

    react-native run-android

    It even has a hot-patch and automatic re-load ability so that the running app updates as you edit your source files.
    One thing has been bugging me like mad, though: there is a "secret" menu of developer options installed in the app, including an option to force a reload, but it requires a "rage-shake" to wake it on the phone.
    Or should I say it did, because if you have your phone tethered by USB to the Android Debug Bridge (adb), you can issue a keypress signal over adb and bind that to a menu or a keyboard shortcut on your computer. Yes, indeed, open a menu on the phone screen with a press of a key in your editor! All you have to do is bind this command to a shortcut or key binding, and Bob's your uncle:

    adb shell input keyevent 82 
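
If your editor makes it easier to bind a script than a raw shell command, the same keypress can be wrapped in a few lines of Python. This is only a sketch: the function names and the optional `serial` argument (for when more than one device is attached) are my own additions, and it assumes `adb` is on your PATH.

```python
import shutil
import subprocess

MENU_KEYCODE = 82  # the key event sent by the adb command above


def dev_menu_command(serial=None):
    """Build the adb argument list; `serial` (hypothetical) selects one device."""
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "input", "keyevent", str(MENU_KEYCODE)]
    return cmd


def open_dev_menu(serial=None):
    """Send the keypress to the tethered phone; raises if adb is missing or fails."""
    if shutil.which("adb") is None:
        raise RuntimeError("adb not found on PATH")
    subprocess.run(dev_menu_command(serial), check=True)
```

Calling `open_dev_menu()` from whatever your editor binds to a key should then pop the developer menu on the phone.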


    Last Monday I attended the FOSS Backstage Micro-Summit and enjoyed it a lot. I haven't been to ApacheCons for quite some time and it was good to see familiar faces and get to know so many new folks.

    The slides of my talk are on Speaker Deck but I'm afraid they aren't that useful without the things I actually said.

    The talks have been recorded, so chances are there will be videos soon.


    One of my favorite conferences in the world is Devoxx Belgium. First of all, it tends to have one of the most enthusiastic audiences I've ever seen. Secondly, its organizers are super awesome and challenge you to give great talks. Third, it was the first conference I ever took my Trish to. In 2011, I took her a second time and proposed to her in Paris afterward.

    This year, I traveled to Devoxx Belgium for the first time without Trish. It was stressful because I didn't prepare well beforehand. However, it was also gratifying because I was able to make everything work, even though it all happened at the last minute. Furthermore, I did the majority of my talks with good friends, which is always a pleasant experience.

    The purpose of this blog post is to document my experience this year, so I can look back and say WTF was I thinking?! ;)

    I left Denver on Monday (November 6) afternoon and flew to Brussels, Belgium. My flight landed in Brussels at 9 am and the (three-hour) talk Josh and I were giving was at 1:30 pm. I made it in time, but it was one of the first times we didn’t have a lot of time to prepare face-to-face beforehand. I learned that getting t-shirts printed in the US to save $500 is a good idea, but having to take two suitcases to carry them all is a bad idea.

    Cloud Native PWAs with Josh Long at Devoxx Belgium

    We did our usual talk and I used Okta's new Angular SDK instead of the Sign-In Widget to showcase authentication. Even though the crucial step I needed was contained in my notes, I failed. One simple line to add an HttpInterceptor and I missed it!

    I think I followed up well with a tweet that showed how to fix it. But who knows how many people use Twitter. One thing's for sure: people tweet more at Devoxx Belgium than at any other conference I’ve ever been to! In fact, the #Devoxx hashtag got hijacked by some porn sites and their tweets started showing up on the Twitter wall.

    I tweeted about what I forgot to do after our talk.

    The talk Josh and I gave was published on YouTube the very next day, which is awesome.

    Tuesday night was the speaker’s reception, so I attended that and turned in around 10 pm. I worked on my next presentation (Angular vs. React) for a few hours after getting back to my hotel.

    On Wednesday, I worked all day with my co-speaker (Deepu, co-lead of JHipster) on our Angular vs. React presentation. We worked for eight hours at the conference venue that day and parted ways around 6 pm.

    On Wednesday night, I attended a dinner with Ray Tsang (Google Cloud Advocate). We were invited (along with Josh) to a dinner with JDriven. Josh couldn’t make it, but Ray and I attended and had a great time. I got home at 10 pm that night and worked on my next day’s presentation until 3 am.

    Thursday, I worked with Deepu for a couple hours to polish and practice our presentation and we delivered it that afternoon. We also advertised the t-shirts we brought.

    Angular vs React Smackdown with Deepu

    There were lots of tweets about our talk, but I think this from Daniel Bryant with our recommendations for Angular vs React was one of my favorites.

    Our session went well, even though it wasn’t super technical, and it was published to YouTube.

    I also published our slides on Speaker Deck.

    We had the JHipster BOF late that night (during the conference movie) and only had three people show up. With five committers there, we still had a great time, and it was fun to give Julien the Duke’s Choice Award trophy since he started the project.

    Duke might've had a little too much to drink during our BOF. :D

    Duke at the JHipster BOF

    I thought our ratings (~4.2) for the two sessions were “good enough” to call the conference a success. Thanks to the conference organizers for delivering such an awesome experience once again.

    The Devoxx Belgium Team

    Devoxx Morocco

    I spent the weekend in Bruges and had a lovely time staying at an Airbnb and visiting some local breweries.

    Brugge | Brugge | Kwak in Brugge

    Brugge by night | The streets of Brugge

    Life is good in Bruges, and it just got a little bit better.

    On Saturday night, I worked for several hours on the Ionic module for JHipster that I needed for my talk at Devoxx Morocco. That’s where the (self-inflicted) drama began. Here’s the timeline of events that I documented in my presentation after my talk:

    • After Devoxx Belgium, tried to finish Ionic module over the weekend.
    • Late night of hacking, couldn’t figure out why what worked last week didn’t work this week.
    • Discovered Ionic “super” starter was upgraded to Angular 5 in the last week.
    • Realized I needed to version the starter or write my own.
    • Tried to make OAuth work, because Okta.
    • Discovered OAuth wouldn’t work, because JHipster implementation uses cookies, and Cordova’s web view won’t work with cookies.
    • Sunday evening (my talk was on Wednesday morning): refactored everything into an Ionic starter.
    • Monday: finished starter, couldn’t get it to work in iOS emulator because CORS.
    • Found bug reports saying CORS doesn’t work over http. Spent hours trying to make it work over https. Couldn’t get a local certificate to be trusted, couldn’t deploy a JHipster app to the cloud (b/c of slow wifi). Even tried cloud-to-cloud, but ran into issues with frontend-maven-plugin on Linux.
    • Monday afternoon: discovered real issue was that emulator runs on port 8080. Changed JHipster/Spring Boot’s port to 9000, and it worked!
    • Tuesday: delivered talk on Cloud Native PWAs with Josh Long.
    • Tuesday after dinner: started working on entity generator for Ionic.
    • Wednesday 4am: Got it working!
    • Wednesday 8-11:25am: wrote presentation.
    • Wednesday 11:30am: delivered talk, showed demo that worked!!

    I published the slides from "Developing PWAs and Mobile Apps with Ionic, Angular, and JHipster" to Speaker Deck.

    I also made a 5 minute video, because I recorded a lot of my development experience along the way.

    Phew! It was an exhausting couple of weeks. I learned something I already knew - you should have your presentation finished before you leave for the conference, especially when traveling overseas!

    Nevertheless, I had a great time. At Devoxx Belgium, it was announced that Josh and I both won Devoxx Champion awards. This award is given to speakers that attend all the Devoxx conferences in a year. Unfortunately, they never told either of us that we got it, so we missed it in the keynote. Luckily, it was recorded.

    At Devoxx Morocco, they notified me five minutes before the keynote that “I should come” and that they had a surprise for me. I was in the midst of my last-minute scramble to get code working and write my presentation, but I went anyway. I’m glad I did because it was a very cool opening keynote and I was honored to receive a Devoxx Champion award.

    Devoxx Morocco Keynote

    Devoxx Champion!

    I made sure to get my picture with Josh, and his girlfriend Tammie, after lunch.

    Devoxx Champions!

    There are two new Okta open source projects as part of my efforts, but they’ll require some polishing before they’re ready for general consumption. I hope to do that before the end of the year, but the end of January is probably more realistic. Below are links to their repos on GitHub:

    For more photos from these events, see my album on Flickr. Devoxx Belgium posted their photos to a Devoxx2017 album, as well as albums for each day: day 1, day 2, day 3, day 4, and day 5. Devoxx Morocco posted all of their photos in three separate albums: day 1, day 2, and day 3.

    I want to thank the organizers from Devoxx Belgium and Devoxx Morocco for accepting my talks and allowing me to fulfill one of my goals for the year: becoming a Devoxx Champion. In 2018, I plan to slow down a bit and speak more in the US, concentrating on Java User Groups.

    However, 2017 isn't over! I'll be speaking at SpringOne and The Rich Web Experience next week. We're also planning a Devoxx4Kids Denver meetup in December and a Denver JUG Holiday Party as well.


    Earlier this year, I wrote a series of blog posts on how to secure access to the Apache Hadoop filesystem (HDFS), using tools like Apache Ranger and Apache Atlas. In this post, we will go further and show how to authorize access to Apache Yarn using Apache Ranger. Apache Ranger allows us to create and enforce authorization policies based on who is allowed to submit applications to run on Apache Yarn. Therefore it can be used to enforce authorization decisions for Hive on Yarn or Spark on Yarn jobs.

    1) Installing Apache Hadoop

    First, follow the steps outlined in the earlier tutorial (section 1) on setting up Apache Hadoop, except that in this tutorial we will work with Apache Hadoop 2.8.2. In addition, we will need to follow some additional steps to configure Yarn (see here for the official documentation). Create a new file called 'etc/hadoop/mapred-site.xml' with the content:
    Next edit 'etc/hadoop/yarn-site.xml' and add:
    Now we can start Apache Yarn via 'sbin/'. We are going to submit jobs as a local user called "alice" to test authorization. First we need to create some directories in HDFS:

    • bin/hdfs dfs -mkdir -p /user/alice/input
    • bin/hdfs dfs -put etc/hadoop/*.xml /user/alice/input
    • bin/hadoop fs -chown -R alice /user/alice
    • bin/hadoop fs -mkdir /tmp
    • bin/hadoop fs -chmod og+w /tmp
    Now we can submit an example job as "alice" via:
    • sudo -u alice bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar grep input output 'dfs[a-z.]+'
    The job should run successfully and store the output in '/user/alice/output'. Delete this directory before trying to run the job again ('bin/hadoop fs -rm -r /user/alice/output').

    2) Install the Apache Ranger Yarn plugin

    Next we will install the Apache Ranger Yarn plugin. Download Apache Ranger and verify that the signature is valid and that the message digests match. Due to some bugs that were fixed for the installation process, I am using version 1.0.0-SNAPSHOT in this post. Now extract and build the source, and copy the resulting plugin to a location where you will configure and install it:
    • mvn clean package assembly:assembly -DskipTests
    • tar zxvf target/ranger-1.0.0-SNAPSHOT-yarn-plugin.tar.gz
    • mv ranger-1.0.0-SNAPSHOT-yarn-plugin ${ranger.yarn.home}
    Now go to ${ranger.yarn.home} and edit "". You need to specify the following properties:
    • POLICY_MGR_URL: Set this to "http://localhost:6080"
    • REPOSITORY_NAME: Set this to "YarnTest".
    • COMPONENT_INSTALL_DIR_NAME: The location of your Apache Hadoop installation
    Save "" and install the plugin as root via "sudo -E ./". Make sure that the user who is running Yarn has the permission to read the policies stored in '/etc/ranger/YarnTest'. There is one additional step to be performed in Hadoop before restarting Yarn. Edit 'etc/hadoop/ranger-yarn-security.xml' and add a property called "ranger.add-yarn-authorization" with value "false". This means that if Ranger policy authorization fails, it doesn't fall back to the default Yarn ACLs (which allow all users to submit jobs to the default queue).

    Finally, re-start Yarn and try to resubmit the job as "alice" as per the previous section. You should now see an authorization error: "User alice cannot submit applications to queue root.default".

    3) Create authorization policies in the Apache Ranger Admin console

    Next we will use the Apache Ranger admin console to create authorization policies for Yarn. Follow the steps in this tutorial to install the Apache Ranger admin service. Start the Apache Ranger admin service with "sudo ranger-admin start" and open a browser and go to "http://localhost:6080/" and log on with "admin/admin". Add a new Yarn service with the following configuration values:
    • Service Name: YarnTest
    • Username: admin
    • Password: admin
    • Yarn REST URL: http://localhost:8088
    Click on "Test Connection" to verify that we can connect successfully to Yarn, and then save the new service. Now click on the "YarnTest" service that we have created. Add a new policy for the "root.default" queue for the user "alice" (create this user if you have not done so already under "Settings, Users/Groups"), with a permission of "submit-app".

    Allow up to 30 seconds for the Apache Ranger plugin to download the new authorization policy from the admin service. Then try to re-run the job as "alice". This time it should succeed due to the authorization policy that we have created.
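
To make the decision flow in this post concrete, here is a toy model of it in Python. It is not Ranger's or Yarn's actual code: the function name, the dictionary-based policy store and the "*" wildcard for the default ACL are assumptions made purely for illustration. It only mirrors the behaviour described above, namely that a matching Ranger policy allows the submission, and that the default Yarn ACLs are consulted only when the fallback is enabled.

```python
def may_submit(user, queue, ranger_policies, yarn_acls, fall_back_to_yarn_acls):
    """Toy authorization check (illustrative only, not the Ranger implementation).

    ranger_policies / yarn_acls map a queue name to a set of allowed users;
    "*" in a Yarn ACL stands for "all users" in this sketch.
    """
    if user in ranger_policies.get(queue, set()):
        return True  # a Ranger policy grants "submit-app" on this queue
    if fall_back_to_yarn_acls:
        acl = yarn_acls.get(queue, set())
        return "*" in acl or user in acl
    return False  # no policy and no fallback: the submission is denied


default_acls = {"root.default": {"*"}}

# No policy yet and fallback disabled: alice is rejected.
print(may_submit("alice", "root.default", {}, default_acls, False))

# After adding a policy for alice on root.default, the job is allowed.
print(may_submit("alice", "root.default", {"root.default": {"alice"}}, default_acls, False))
```

The first call corresponds to the authorization error seen after installing the plugin, the second to the successful re-run once the policy has been downloaded.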


    A recent blog post covered how to install the Apache Kerby KDC. In this post we will build on that tutorial to show how to get a major new feature of Apache Kerby 1.1.0 to work, namely Kerberos cross-realm support. Cross-realm support means that the KDCs in realm "A" and realm "B" are configured in such a way that a user who is authenticated in realm "A" can obtain a service ticket for a service in realm "B" without having to explicitly authenticate to the KDC in realm "B".

    1) Configure the KDC for the "EXAMPLE.COM" realm

    First we will configure the Apache Kerby KDC for the "EXAMPLE.COM" realm. Follow the previous tutorial to install and configure the KDC for this (default) realm. We need to follow some additional steps to get cross-realm support working with a second KDC in realm "EXAMPLE2.COM". Edit 'conf/krb5.conf' and replace the "realms" section with the following configuration:
    Next we need to add a special principal to the KDC to enable cross-realm support via (after restarting the KDC):

    • sh bin/ conf/ -k keytabs/admin.keytab
    • addprinc -pw security krbtgt/EXAMPLE2.COM@EXAMPLE.COM
    2) Configure the KDC for the "EXAMPLE2.COM" realm

    Now we will configure a second KDC for the "EXAMPLE2.COM" realm. Download the Apache Kerby source code as before. Unzip the source and build the distribution via:
    • mvn clean install -DskipTests
    • cd kerby-dist
    • mvn package -Pdist
    Copy "kdc-dist" to a location where you wish to install the second KDC. In this directory, create two directories called "keytabs" and "runtime". Edit 'conf/backend.conf' and change the value for 'backend.json.dir' to avoid conflict with the first KDC instance. Then create some keytabs via:
    • sh bin/ conf keytabs
    For testing purposes, we will change the port of the KDC from the default "88" to "54321" to avoid having to run the KDC with administrator privileges. Edit "conf/krb5.conf" and "conf/kdc.conf" and change "88" to "54321". In addition, change the realm from "EXAMPLE.COM" to "EXAMPLE2.COM" in both of these files. As above, edit 'conf/krb5.conf' and replace the "realms" section with the following configuration:
    Next start the KDC via:
    • sh bin/ conf runtime
    We need to add a special principal to the KDC to enable cross-realm support, as in the KDC for the "EXAMPLE.COM" realm. Note that it must be the same principal name and password as for the first realm. We will also add a principal for a service in this realm:
    • sh bin/ conf/ -k keytabs/admin.keytab
    • addprinc -pw security krbtgt/EXAMPLE2.COM@EXAMPLE.COM
    • addprinc -pw password service@EXAMPLE2.COM
    3) Obtaining a service ticket for service@EXAMPLE2.COM as alice@EXAMPLE.COM

    Now we can obtain a service ticket for the service we have configured in the "EXAMPLE2.COM" realm as a user who is authenticated to the "EXAMPLE.COM" realm. Configure the "tool-dist" distribution as per the previous tutorial, updating 'conf/krb5.conf' with the same "realms", "domain_realm" and "capaths" information as shown above. Now we can authenticate as "alice" and obtain a service ticket as follows:
    • sh bin/ -conf conf alice@EXAMPLE.COM
    • sh bin/ -conf conf -c /tmp/krb5cc_1000 -S service@EXAMPLE2.COM
    If you run "klist" then you should see that a ticket for "service@EXAMPLE2.COM" was obtained successfully.
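
The naming convention behind the special principal can be sketched in a few lines of Python. The helper names here are hypothetical and this is not part of Kerby; it only shows how the principal added to both KDCs above is formed from the two realm names, and when it is needed.

```python
def realm_of(principal):
    """Return the realm part of a principal such as 'alice@EXAMPLE.COM'."""
    return principal.rsplit("@", 1)[1]


def needs_cross_realm(client, service):
    """True when the client and the service live in different realms."""
    return realm_of(client) != realm_of(service)


def cross_realm_tgt_principal(client_realm, service_realm):
    """Name of the principal shared by both KDCs, as added with addprinc above."""
    return f"krbtgt/{service_realm}@{client_realm}"


client, service = "alice@EXAMPLE.COM", "service@EXAMPLE2.COM"
if needs_cross_realm(client, service):
    print(cross_realm_tgt_principal(realm_of(client), realm_of(service)))
```

For the two realms in this post, this yields the same "krbtgt/EXAMPLE2.COM@EXAMPLE.COM" name that was added to both KDCs.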


      I noticed that Firefox was not playing any sound any more. Apparently pulseaudio stopped working. Manually running pa showed the following error:

      xmalloc.c: Assertion 'size < (1024*1024*96)' failed at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/xmalloc.c:72, function pa_xmalloc0(). Aborting.

      This means it is trying to allocate a ridiculous amount of memory.

      Looking at the backtrace in gdb reveals:

      (gdb) bt
      #0  0x00007ffff6a5ef50 in raise () from /lib64/
      #1  0x00007ffff6a60bfa in abort () from /lib64/
      #2  0x00007ffff7914b32 in pa_xmalloc0 () from /usr/lib64/
      #3  0x00007ffff7ba1db1 in pa_database_open () from /usr/lib64/pulseaudio/
      #4  0x00007fffeed60468 in module_card_restore_LTX_pa__init () from /usr/lib64/pulse-10.0/modules/
      #5  0x00007ffff7b5ac98 in pa_module_load () from /usr/lib64/pulseaudio/
      #6  0x00007ffff7b49751 in ?? () from /usr/lib64/pulseaudio/
      #7  0x00007ffff7b4fd2c in pa_cli_command_execute_line_stateful () from /usr/lib64/pulseaudio/
      #8  0x00007ffff7b50551 in pa_cli_command_execute_file_stream () from /usr/lib64/pulseaudio/
      #9  0x0000000000406e55 in main ()
      So it is trying to read some database. What database?

      strace tells us quickly:
      open("/home/xoxo/.pulse/91b2f1e2678a89a9c38b83075061a39a-card-database.x86_64-pc-linux-gnu.simple", O_RDONLY|O_CLOEXEC) = 9
      So likely this thingy is corrupt. Solved this with:

      rm -rf .pulse
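
If you would rather keep the old files around for inspection instead of deleting them outright, a more cautious variant is to move the directory aside. This is just a sketch: the `set_aside` helper and the ".bak-" naming are made up for this example, and it assumes the broken database lives under ~/.pulse as in the strace output above.

```python
import time
from pathlib import Path


def set_aside(directory, suffix=None):
    """Rename `directory` to '<name>.bak-<suffix>' and return the new path.

    Returns None if the directory does not exist. The suffix defaults to a
    timestamp so repeated runs do not collide.
    """
    src = Path(directory).expanduser()
    if not src.exists():
        return None
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.bak-{suffix}")
    src.rename(dst)
    return dst


# Usage (not run here): set_aside("~/.pulse")
```

As with the rm approach, the idea is simply to get the corrupt database out of the way before starting pulseaudio again.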


      It's been a long time since I posted anything here... :-)
      So after 4.5 years of a very happy life in Melbourne, we have decided to move further north.
      In mid-January we will start new adventures in Brisbane. It's a very exciting time for us (well, the kids are a bit sad to leave friends but happy to try something different).
      Some have been asking us why, if we're happy here :-)
      Well, that's a bit difficult to answer...
      Maybe we're tired of the famous Melbourne weather...
      Yes, the Brisbane weather looks so great: short pants and thongs all year round :-)
      Housing is ridiculously expensive in Melbourne, so it's time to stop paying someone else's mortgage.
      After a long time, it's maybe a good idea to leave the comfort zone and try something different.

      Cya soon up north :-)


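      Just last week, James, the head of the cloud platform team, and I attended the AWS 2017 event on behalf of the 여기어때 R&D center. I was surprised first by the Las Vegas Strip, lined on both sides with enormous hotels and casinos and nothing like Denver or San Francisco, and surprised a second time by the sheer scale of the event, starting with the Registration Booth. For reference, Las Vegas is set up to host this many people, so I'm told large conferences are held there continually (as far as I know, CES was being held at the same time). More than 40,000 attendees gathered at hotel venues all over Las Vegas, and Uber drivers casually greeted us by asking whether we were in town for the AWS conference. As is already well known, many companies in Korea use AWS. Incidentally, at the re:Play event I ran into a former Naver colleague for the first time in almost 10 years. :-)

      As for the technical sessions, I couldn't attend them all, so I can't claim my view is accurate, but it comes down to two things: Serverless and AI.

      Serverless is changing the programming paradigm at its roots, and we now live in a world where full-stack engineers, with their infrastructure know-how and experience-based skills, are unlikely to be valued any longer (unless they are at the level of the rare open source hacker).

      The same goes for AI: learning capability has already started moving into the cloud.

      • The post by 배재현, well known as a data engineer at Netflix, is also worth reading.

      One slide that caught my eye … The Deep Learning Summit, the session I followed most eagerly, ran a full four hours straight, and one slide in it really stuck with me … Perhaps that is because it resembles what I often talk about myself: user experience can be seen as moving from the OS to the web, from the web to mobile, and now on to AI. Where does our IT technology stand today? :-)

      Leaving all these thoughts behind, I stopped by the Grand Canyon just before flying home, took in the vastness of the world, and then returned to Korea …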


      Your latest emerge -uavD @world may result in the following error:

      # required by sys-apps/portage-2.3.13-r1::gentoo[python_targets_python3_4,-build,python_targets_python2_7,-python_targets_python3_5]
      # required by virtual/package-manager-0::gentoo
      # required by @system
      # required by @world (argument)
      >=dev-python/pyblake2-0.9.3-r1 python_targets_python3_4
      That's because python-3.5 as well as python-3.6 went stable, which in turn causes python USE flag changes, and portage can't correctly figure out what to do.

      Solve this by manually installing python-3.5 (only) first and then recompiling for the resulting USE flag changes:
      emerge -1av python:3.5
      eselect python update
      # add other packages that produce conflicts
      emerge -1av portage pyblake2 gentoolkit java-config
      emerge -1avD --changed-use @world
      That should let you update world again and depclean will remove python-3.4 for you.
      emerge -uavD @world
      emerge --depclean


      We had the pleasure to release our monitoring dashboards designed for Apache Cassandra on Datadog last week. It is a nice occasion to share our thoughts around Cassandra Dashboards design as it is a recurrent question in the community.

      We wrote a post about this on the Datadog website here.

      For people using Datadog we hope this will give more details on how the dashboards were designed, thus on how to use the dashboards we provided. For others, we hope this information will be useful in the process of building and then using your own dashboards, with the technology of your choice.

      The Project

      Building an efficient, complete, and readable set of dashboards to monitor Apache Cassandra is time consuming and far from being straightforward.

      Those who have tried it probably noticed that it requires a fair amount of time and knowledge of both the monitoring technology in use (Datadog, Grafana, Graphite or InfluxDB, metrics-reporter, etc.) and Apache Cassandra. Creating dashboards is about picking the most relevant metrics, aggregations, units, and chart types, and then gathering them in such a way that this huge amount of data actually provides usable information. Dashboards need to be readable, understandable and easy to use for the final operator.

      On one hand, creating comprehensive dashboards is a long and complex task. On the other hand, every Apache Cassandra cluster can be monitored roughly the same way. Most production issues can be detected and analyzed using a common set of charts, organized the same way, for all the Apache Cassandra clusters. Each cluster may require additional operator specific dashboards or charts depending on workload and merging of metrics outside of Cassandra, but those would supplement the standard dashboards, not replace them. There are some differences depending on the Apache Cassandra versions in use, but they are relatively minor and not subject to rapid change.

      In my monitoring presentation at the 2016 Cassandra Summit I announced that we were working on this project.

      In December 2017 it was released for Datadog users. If you want to get started with these dashboards and you are using Datadog, see the documentation on the Datadog integration for Cassandra.

      Dashboard Design

      Our Approach to Monitoring

      The dashboards have been designed to allow the operator to do the following:

      1. Easily detect any anomaly (Overview Dashboard)
      2. Be able to efficiently troubleshoot and fix the anomaly (Themed Dashboards)
      3. Find the bottlenecks and optimize the overall performance (Themed Dashboards)

      The latter two points can be seen as the same kind of operation, which can be supported by the same set of dashboards.

      Empowering the operator

      We strongly believe that showing the metrics to the operator can be a nice entry point for learning about Cassandra. Each of the themed dashboards monitors a distinct internal process of Cassandra. Most of the metrics related to this internal process are then grouped within a dashboard. We think it makes it easier for the operator to understand Cassandra’s internal processes.

      To make it clearer, let’s consider the example of someone completely new to Cassandra. On first repair, the operator starts an incremental repair without knowing anything about it and latencies increase substantially after a short while. Classic.

      The operator would notice a read latency increase in the ‘Overview Dashboard’, then turn to the ‘Read Path Dashboard’. There the operator would be able to notice that the number of SSTables went from 50 to 800 on each node, or for a table. If the chart is there out of the box, then even without knowing what an SSTable is, the operator can understand that something changed there and that it relates to the outage somehow. The operator would then search in the right direction, probably solving the issue quickly, and possibly learning in the process.

      What to Monitor: Dashboards and Charts Detail

      Here we focus on chart details and on how to use each chart efficiently. While this post is a discussion of the dashboards available for Datadog, the metrics can be visualized using any tool, and we believe this would be a good starting point when setting up monitoring for Cassandra.

      In the graphs, the values and percentiles chosen are sometimes quite arbitrary and often depend on the use case or Cassandra setup. The point is to give a reference, a starting point on what could be ‘normal’ or ‘wrong’ values. The Apache Cassandra monitoring documentation, the mailing list archive, or #cassandra on freenode (IRC) are good places to get answers to questions that might pop up while using dashboards.

      Some charts are deliberately duplicated across dashboards or within a dashboard, but with distinct visualisation or aggregation.

      Detect anomalies: Overview Dashboard

      We don’t try to troubleshoot at this stage. We want to detect outages that might impact the service or check that the Cassandra cluster is globally healthy. To accomplish this, the Overview Dashboard aims to be both complete and minimalist.

      Complete, as we want to be warned anytime “something is happening” in the Cassandra cluster. Minimalist, because we don’t want to miss important information because of a flood of non-critical or too low-level information. These charts aim to answer the question: “Is Cassandra healthy?”.

      TLP Dashboards - Overview

      Troubleshoot issues and optimize Apache Cassandra: Themed dashboards

      The goal here is to divide the information into smaller, more meaningful chunks. An issue will often affect only one of the subsystems of Cassandra, so the operator can have all the needed information in one place when working on it, without irrelevant information (for this specific issue) hiding more important information.

      For this reason these dashboards must maximize the information on a specific theme or internal process of Cassandra and show all the low level information (per table, per host). We are often repeating charts from other dashboards, so we always find the information we need as Cassandra users. This is the opposite of the overview dashboard described above, which shows only “high level” information.

      Read Path Dashboard

      In this dashboard we are concerned about any element that could impact a high level client read. In fact, we want to know about everything that could affect the read path in Cassandra by just looking at this dashboard.

      TLP Dashboards - Read Path Dashboard

      Write Path Dashboard

      This dashboard focuses on a comprehensive view of the various metrics which affect write latency and throughput. Long garbage collection pause times will always result in dips in throughput and spikes in latency, so garbage collection is featured prominently on this dashboard.

      TLP Dashboards - Write Path Dashboard

      SSTable management Dashboard

      This dashboard is about getting a comprehensive view of the various metrics which impact the asynchronous steps the data goes through after a write, from the flush to the data deletion with all the compaction processes in between. Here we want to be aware of disk space evolution and make sure asynchronous management of SSTables is happening efficiently or as expected.

      TLP Dashboards - SSTable Management Dashboard

      Alerting, Automated Anomaly Detection.

      To conclude, once you are happy with your monitoring dashboards, it is a good idea to add some alerting rules.

      It is important to detect all the anomalies as quickly as possible. To bring monitoring to the next level of efficiency, it is good to be warned automatically when something goes wrong.

      We believe adding alerts on each of the “Overview Dashboard” metrics will be sufficient to detect most issues and any major outage, or at least be a good starting point. For each metric, the alerting threshold should be high enough to avoid triggering false alerts, yet low enough that a mitigating action can still be taken in time. Some alerts should use absolute values (disk space available, CPU, etc.), while others will require relative values. Some alerts, such as those on latencies, will need manual tuning based on configuration and workload.

      The biggest risk with alerting is probably being flooded by false alerts, as the natural inclination is then to start ignoring them, which leads to missing valid ones. As a general guideline, any alert should trigger an action; if it does not, the alert is relatively useless and adds noise.
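
The difference between absolute and relative alerts can be illustrated with a small Python sketch. It is tool-agnostic and not how Datadog defines monitors; the function names and the example thresholds (90% disk usage, twice the baseline latency) are assumptions chosen only for illustration and would need tuning per cluster.

```python
def absolute_alert(value, threshold):
    """Fire when a metric crosses a fixed limit (e.g. disk usage in percent)."""
    return value >= threshold


def relative_alert(current, baseline, max_ratio):
    """Fire when a metric grows too much compared to its usual level.

    `baseline` would typically be an average over a reference window;
    a non-positive baseline is treated as "no data" and never fires.
    """
    if baseline <= 0:
        return False
    return current / baseline >= max_ratio


# Disk usage at 92% against a fixed 90% limit.
print(absolute_alert(92.0, 90.0))

# Read latency of 30 ms against a usual 10 ms, alerting at 2x the baseline.
print(relative_alert(30.0, 10.0, 2.0))
```

A fixed limit suits metrics with a hard ceiling, while a ratio against a baseline suits metrics such as latencies whose "normal" value differs from one workload to the next.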


      Service Workers enable a web application to be responsive even if the network isn't. Frameworks like AngularJS, React and Vue.js enable web applications to efficiently update and render web pages as data changes.

      The Apache Software Foundation's Whimsy board agenda application uses both in combination to achieve a responsive user experience - both in terms of quick responses to user requests and quick updates based on changes made on the server.

      From a performance perspective, the two cases easiest to optimize for are (1) the server fully up and running, accessed across a fast network with all possible items cached, and (2) the application fully offline, since once you make offline possible at all, it will be fast.

      The harder cases are the ones where the server has received a significant update and needs to get that information to users, and even harder is when the server has no instances running and needs to spin up a new instance to process a request. While it is possible to do blue/green deployment for applications that are "always on", this isn't practical or appropriate for applications which are only used in periodic bursts. The board agenda tool is one such application.

      This article describes how a goal of sub-second response time is achieved in such an environment. There are plenty of articles on the web that show snippets or sanitized approaches; this one focuses on real world usage.

      Introduction to Service Workers

      Service Workers are JavaScript files that can intercept and provide responses to navigation and resource requests. Service Workers are supported today by Chrome and Firefox, and are under development in Microsoft Edge and WebKit/Safari.

      Service Workers are part of a larger effort dubbed "Progressive Web Apps" that aims to make web applications reliable and fast, no matter what the state of the network may happen to be. The word "progressive" in this name is there to indicate that these applications will work with any browser to the best of that browser's ability.

      The signature or premier feature of Service Workers is offline applications. Such web applications are loaded normally the first time, and cached. When offline, requests are served by the cache, and any input made by users can be stored in local storage or in IndexedDB. The Offline Cookbook provides a number of recipes that can be used.

      Overview of the Board Agenda Tool

      This information is for background purposes only. Feel free to skim or skip.

      The ASF Board meets monthly, and minutes are published publicly on the web. A typical meeting has over one hundred agenda items, though the board agenda tool assists in resolving most of them offline, leaving a manageable 9 officer reports, around 20 PMC reports that may or may not require action, and a handful of special orders.

      While the full agenda is several thousand lines long, this file size is only a quarter of a megabyte or the size of a small image. The server side of this application parses the agenda and presents it to the client in JSON format, and the result is roughly the same size as the original.

      To optimize the response of the first page access, the server is structured to do server side rendering of the page that is requested, and the resulting response starts with links to stylesheets, then contains the rendered HTML, and finally any scripts and data needed. This allows the browser to incrementally render the page as it is received. This set of scripts includes a script that can render any page (or component) that the board agenda tool can produce, and the data includes all the information necessary to do so. The current implementation is based on Vue.js.

      Once loaded, traversal between pages is immeasurably quick. By that I mean that you can go to the first page and lean on the right arrow button, and pages will smoothly scroll by at roughly the rate at which you can see the faces in a deck of cards shuffled upside down.

      The pages generally contain buttons and hidden forms; which buttons appear often depends on the user who requests the page. For example, only Directors will see approve and unapprove buttons; and individual directors will only see one of these two buttons based on whether or not they have already approved the report.

      A WebSocket between the server and client is made mostly so the server can push changes to each client; changes that then cause re-rendering and updated displays. Requests from the client to the server generally are done via XMLHttpRequest as it wasn't until very recently that Safari supported fetch. IE still doesn't, but Edge does.

      Total (uncompressed) size of the application script is another quarter of a megabyte, and dependencies include Vue.js and Bootstrap, the latter being the biggest, requiring over half a megabyte of minified CSS.

      All scripts and stylesheets are served with a Cache-Control: immutable header as well as an expiration date a year from when the request was made. This is made possible by the expedient of utilizing a cache busting query string that contains the last modified date. ETag and 304 responses are also supported.
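
      One way to picture the cache busting scheme is the sketch below. The function name, the use of a Unix timestamp as the query string, and the exact header values are assumptions made for illustration; the text above only states that the query string contains the last modified date.

```javascript
// Build a cache-busting URL and long-lived caching headers for a static asset.
// `mtime` is the file's last-modified time and `now` the time of the request,
// both as Date objects (hypothetical inputs for this sketch).
function assetResponseMeta(path, mtime, now) {
  const oneYearSeconds = 365 * 24 * 60 * 60;
  const stamp = Math.floor(mtime.getTime() / 1000);
  return {
    // the query string changes whenever the file changes, so a stale
    // cached copy is simply never requested again
    url: path + "?" + stamp,
    headers: {
      "Cache-Control": "immutable, max-age=" + oneYearSeconds,
      "Expires": new Date(now.getTime() + oneYearSeconds * 1000).toUTCString(),
      "Last-Modified": mtime.toUTCString(),
    },
  };
}
```

      Because the URL itself changes with every modification, the browser can be told to keep each version for as long as it likes.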

      Offline support was added recently. Updates made when offline are stored in IndexedDB and sent as a batch when the user returns online. Having all of the code and data to render any page made this support very straightforward.

      Performance observations (pre-optimization)

      As mentioned at the top of this article, offline operations are virtually instantaneous. Generally, immeasurably so. As described above, this also applies to transitions between pages.

      This leaves the initial visit and returning visits; the latter includes opening the application in new tabs.

      Best case response times for these cases are about a second. This may be due to the way that server side rendering is done or perhaps due to the fact that each page is customized to the individual. Improving on this is not a current priority, though the solution described later in this article addresses this.

      Worst case response times are when there are no active server processes and all caches (both server side and client side) are either empty or stale. It is hard to get precise numbers for this, but it is on the order of eight to ten seconds. Starting the server accounts for somewhere around four of those. Building the JSON form of the agenda can take another two, given all of the validation (involving things like LDAP queries) involved in the process. Regenerating the ES5 JavaScript from sources can take another second or so. Producing the custom rendered HTML is another second. And then there is all of the client side processing.

      In all, probably just under ten seconds if the server is otherwise idle. It can be a little more if the server is under moderate to heavy load.

      The worst parts of this:

      1. No change is seen on the browser window until the last second or so.
      2. While the worst case scenario is comparatively rare in production, it almost exactly matches what happens in development.

      Selecting an approach

      Given that the application can be brought up quickly in an entirely offline mode, one possibility would be to show the last cached status and then request updated information and process that information when received. This approach works well if the only change is to agenda data, but doesn't work so well in production whenever a script change is involved.

      This can be solved with a window.location.reload() call, which is described (and somewhat discouraged) as approach #2 in Dan Fabulich's "How to Fix the Refresh Button When Using Service Workers". Note the code below was written before Dan's page was published, but in any case, Dan accurately describes the issue.

      Taking some measurements on this produces interesting results. What is needed to determine if a script or stylesheet has changed is a current inventory from the server. This can consistently be provided quickly and is independent of the user requesting the data, so it can be cached. But since the data size is small enough, caching (in the sense of HTTP 304 responses) isn't all that helpful.

      Response time for this request in realistic network conditions when there is an available server process is around 200 milliseconds, and doesn't tend to vary very much.

      The good news is that this completely addresses the "reload flash" problem.

      Unfortunately, the key words here are "available server process" as that was the original problem to solve.

      Fortunately, a combination approach is possible:

      1. Attempt to fetch the inventory page from the network, but give it a deadline that it should generally beat. Say, 500 milliseconds (half a second).
      2. If the deadline isn't met, load potentially stale data from the cache, and request newer data. Once the network response is received (which had a 500 millisecond head start), determine if any scripts or stylesheets changed. If not, we are done.
      3. Only if the deadline wasn't met AND there was a change to a stylesheet or more commonly a script, perform a reload; and figure out a way to address the poor user experience associated with a reload.
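
      The three steps above boil down to a small decision table, sketched here as a pure function. The function and parameter names are hypothetical; the actual service worker code works with promises and timers rather than precomputed timings.

```javascript
// Decide what happens for one request of the inventory page.
//   networkMs:  how long the network took to answer (Infinity if it never did)
//   changed:    whether the network response lists a new script or stylesheet
//   deadlineMs: the deadline the network is expected to beat
function plan(networkMs, changed, deadlineMs) {
  if (networkMs <= deadlineMs) {
    // step 1: the network beat the deadline, so its response is used as is
    return { serve: "network", reload: false };
  }
  // steps 2 and 3: serve the cached copy now; reload only if a late
  // network response shows that a script or stylesheet changed
  const late = Number.isFinite(networkMs);
  return { serve: "cache", reload: late && changed };
}
```

      Only one of the four combinations ends in a reload, which is why the reload is expected to be rare in production.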

      Additional exploration led to the solution where the inventory page mentioned above could be formatted in HTML and, in fact, be the equivalent of a blank agenda page. Such a page would still be less than 2K bytes, and performance would be equivalent to loading a blank page and then navigating to the desired page; in other words, immeasurably fast.


      If you look at existing recipes, Network or Cache is pretty close; the problem is that it leaves the user with stale data if the network is slow. It can be improved upon.

      Starting with the fetch from the network:

        // attempt to fetch bootstrap.html from the network
        fetch(request).then(function(response) {
          // cache the response if OK, fulfill the response if not timed out
          if (response.ok) {
            cache.put(request, response.clone());
            // preload stylesheets and javascripts
            if (/bootstrap\.html$/.test(request.url)) {
              response.clone().text().then(function(text) {
                // timeoutId is null once the cached copy has been served
                var toolate = !timeoutId;
                setTimeout(
                  function() {
                    preload(cache, request.url, text, toolate)
                  },
                  (toolate ? 0 : 3000)
                )
              })
            }
            // in time: cancel the timer and answer with the network response
            if (timeoutId) {
              clearTimeout(timeoutId);
              resolve(response)
            }
          } else {
            // bad response: use cache instead
            replyFromCache(true)
          }
        }).catch(function(failure) {
          // no response: use cache instead
          replyFromCache(true)
        })

      This code needs to be wrapped in a Promise that provides a resolve function, and needs access to a cache as well as a variable named timeoutId that determines whether or not the response has timed out.

      If the response is ok, it will be cached and a preload method will be called to load resources mentioned in the page. That will either be done immediately if toolate (the timer has already expired), or otherwise after a short delay to allow updates to be processed. Finally, if such a response was received in time, the timer will be cleared, and the promise will be resolved.

      If either a bad response or no response was received (typically, this represents a network failure), the cache will be used instead.

      Next the logic to reply from the cache:

        // common logic to reply from cache
        var replyFromCache = function(refetch) {
          return cache.match(request).then(function(response) {
            clearTimeout(timeoutId);
            if (response) {
              // serve the cached copy; a null timeoutId tells the fetch
              // logic that any later network response is too late
              resolve(response);
              timeoutId = null
            } else if (refetch) {
              fetch(event.request).then(resolve, reject)
            }
          })
        };

        // respond from cache if the server isn't fast enough
        timeoutId = setTimeout(function() {replyFromCache(false)}, timeout);

      This code looks for a cache match, and if it finds one, it will resolve the response, and clear the timeoutId enabling the fetch code to detect if it was too late.

      If no response is found, the action taken will be determined by the refetch argument. The fetch logic above passes true for this, and the timeout logic passes false. If true, it will retry the original request (which presumably will fail) and return that result to the user. This handles a should-never-happen scenario where the cache doesn't contain the bootstrap page.

      The above two snippets of code are then wrapped by a function, providing the event, resolve, reject, and cache variables, as well as declaring and initializing the timeoutId variable:

      // Return a bootstrap.html page within 0.5 seconds.  If the network responds
      // in time, go with that response, otherwise respond with a cached version.
      function bootstrap(event, request) {
        return new Promise(function(resolve, reject) {
          var timeoutId = null;

          caches.open("board/agenda").then(function(cache) {
            // ... the two snippets above go here ...
          })
        })
      }

      Next, we need to implement the preload function:

      // look for css and js files in the HTML response and ensure that each is cached
      function preload(cache, base, text, toolate) {
        var pattern = /"[-.\w+/]+\.(css|js)\?\d+"/g;
        var count = 0;
        var changed = false;
        var match;

        while (match = pattern.exec(text)) {
          count++;
          // const gives each iteration its own binding for the callbacks below
          const path = match[0].split("\"")[1];
          const request = new Request(new URL(path, base));

          cache.match(request).then(function(response) {
            if (response) {
              // already cached: nothing more to do
              count--
            } else {
              fetch(request).then(function(response) {
                if (response.ok) cacheReplace(cache, request, response);
                count--;
                // last outstanding request and the page was served from cache
                if (count == 0 && toolate) {
                  clients.matchAll().then(function(clients) {
                    clients.forEach(function(client) {
                      client.postMessage({type: "reload"})
                    })
                  })
                }
              })
            }
          })
        }
      }

      This code parses the HTML response, looking for .css and .js files, based on knowledge of how this particular server formats the HTML. For each such entry in the HTML, the cache is searched for a match. If one is found, nothing more needs to be done. Otherwise, the resource is fetched and placed in the cache.

      Once all requests are processed, and if this involved requesting a response from the network, then a check is made to see if this was a late response, and if so, a reload request is sent to all client windows.
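
      To see what the pattern in preload picks up, the matching step can be isolated into a small function that runs outside of a service worker. The sample markup is hypothetical and only mimics the shape the pattern expects: a quoted path ending in .css or .js followed by a numeric query string.

```javascript
// Extract cache-busted stylesheet and script paths from an HTML string,
// using the same pattern as preload.
function extractAssets(html) {
  const pattern = /"[-.\w+/]+\.(css|js)\?\d+"/g;
  const paths = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // match[0] still includes the surrounding quotes; strip them
    paths.push(match[0].slice(1, -1));
  }
  return paths;
}

// Hypothetical markup, for illustration only
const sample =
  '<link rel="stylesheet" href="stylesheets/app.css?1518000000">' +
  '<script src="app.js?1518000001"></script>' +
  '<img src="logo.png">';
```

      Calling extractAssets(sample) yields the stylesheet and the script, and skips the image because it has neither a matching extension nor a numeric query string.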

      cacheReplace is another application specific function:

      // insert or replace a response into the cache.  Delete other responses
      // with the same path (ignoring the query string).
      function cacheReplace(cache, request, response) {
        var path = request.url.split("?")[0];

        cache.keys().then(function(keys) {
          keys.forEach(function(key) {
            if (key.url.split("?")[0] == path && key.url != path) {
              cache.delete(key).then(function() {})
            }
          })
        });

        cache.put(request, response)
      }

      The purpose of this method is as stated: to delete from the cache other responses that differ only in the query string. It also adds the response to the cache.
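
      The selection of entries to delete can be tried out on plain strings. This is a simplified variation rather than the code used by the tool: the function name is hypothetical, and unlike cacheReplace it compares against the full new URL so that the entry being stored is never selected.

```javascript
// Given the URLs currently in the cache and the URL about to be stored,
// list the entries that share its path but carry a different query string.
function staleEntries(cachedUrls, newUrl) {
  const path = newUrl.split("?")[0];
  return cachedUrls.filter(function (url) {
    return url.split("?")[0] === path && url !== newUrl;
  });
}
```

      With entries for /app.js?100, /app.css?100 and /app.js?200 in the cache, storing /app.js?200 selects only /app.js?100 for deletion.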

      The remainder is either straightforward or application specific in a way that has no performance relevance. The scripts and stylesheets are served with a cache falling back to network strategy. The initial preloading which normally could be as simple as a call to cache.addAll needs to be aware of query strings and for this application it turns out that a different bootstrap HTML file is needed for each meeting.

      Finally, here is the client side logic which handles reload messages from the service worker:

      navigator.serviceWorker.register(scope + "sw.js", scope).then(function() {
        // watch for reload requests from the service worker
        navigator.serviceWorker.addEventListener("message", function(event) {
          if (event.data.type == "reload") {
            // ignore reload request if any input or textarea element is visible
            var inputs = document.querySelectorAll("input, textarea");
            if (Math.max.apply(
              Math,
              Array.from(inputs).map(function(element) {
                return element.offsetWidth
              })
            ) <= 0) window.location.reload()
          }
        })
      })

      This code watches for type: "reload" messages from the service worker and invokes window.location.reload() only if there are no input or text area elements visible, which is determined using the offsetWidth property of each element. Very few board agenda pages have visible input fields by default; many, however, have bootstrap modal dialog boxes containing forms.

      Performance Results

      In production when using a browser that supports Service Workers, requests for the bootstrap page now typically range from 100 to 300 milliseconds, with the resulting page fully loaded in 400 to 600 milliseconds. Generally, this includes the time it takes to fetch and render updated data, but in rare cases that may take up to an additional 200 milliseconds.

      In development, and in production when there are no server processes available and when accessed using a browser that supports Service Workers, the page initially loads in 700 to 1200 milliseconds. It is not clear to me why this sees a greater range of response times; but in any case, this is still a notable improvement. Often in development, and in rare cases in production, there may be a noticeable refresh that occurs one to five seconds later.

      Visits by browsers that do not support service workers, and for that matter the first visit by a new user to the board agenda tool, do not see any performance improvement or degradation with these changes.

      Not a bad result from less than 100 lines of code.

      0 0

      Gentoo has new profiles that require you to "recompile everything". That is technically not really necessary. Only static libraries really need recompiling.

      Here is why:
      A static library is just an archive of .o files (similar to tar), nothing more, and linking against a static library is roughly the same as just adding more .o files to the linker line. You can also link a static library into a shared library - the code in the static library is then just copied into the shared library (but the code then must be compiled with -fPIC, as with all other code that is used in shared libraries).

      You can find static libs like so:

      equery b $(find /usr/lib/ /lib/ -name '*.a') | awk '{ print $1; }' | sort | uniq
      Typically this yields packages like elfutils, libbsd, nss, iproute2, keyutils, texinfo, flex, db, numactl.

      0 0

      Apache Syncope is a powerful open source Identity Management project that has recently celebrated 5 years as an Apache top level project. Until recently, a username and password had to be supplied to log onto either the admin or enduser web consoles of Apache Syncope. However, SAML SSO login has been supported since the 2.0.3 release. Instead of supplying a username/password, the user is redirected to a third party IdP for login, before being redirected back to the Apache Syncope web console. In 2.0.5, support for the IdP-initiated flow of SAML SSO was added.

      In this post we will show how to configure Apache Syncope to use SAML SSO as an alternative to logging in using a username and password. We will use Apache CXF Fediz as the SAML SSO IdP. In addition, we will show how to achieve IdP-initiated SSO using Okta. Please also refer to this tutorial on achieving SAML SSO with Syncope and Shibboleth.

      1) Logging in to Apache Syncope using SAML SSO

      In this section, we will cover setting up Apache Syncope to re-direct to a third party IdP so that the user can enter their credentials. The next section will cover the IdP-initiated case.

      1.a) Enable SAML SSO support in Apache Syncope

      First we will configure Apache Syncope to enable SAML SSO support. Download and extract the most recent standalone distribution release of Apache Syncope (2.0.6 was used in this post). Start the embedded Apache Tomcat instance and then open a web browser and navigate to "http://localhost:9080/syncope-console", logging in as "admin" and "password".

      Apache Syncope is configured with some sample data to show how it can be used. Click on "Users" and add a new user called "alice" by clicking on the subsequent "+" button. Specify a password for "alice" and then select the default values wherever possible (you will need to specify some required attributes, such as "surname"). Now in the left-hand column, click on "Extensions" and then "SAML 2.0 SP". Click on the "Service Provider" tab and then "Metadata". Save the resulting Metadata document, as it will be required to set up the SAML SSO IdP.

      1.b) Set up the Apache CXF Fediz SAML SSO IdP

      Next we will turn our attention to setting up the Apache CXF Fediz SAML SSO IdP. Download the most recent source release of Apache CXF Fediz (1.4.3 was used for this tutorial). Unzip the release and build it using maven ("mvn clean install -DskipTests"). In the meantime, download and extract the latest Apache Tomcat 8.5.x distribution (tested with 8.5.24). Once Fediz has finished building, copy all of the "IdP" wars (e.g. in fediz-1.4.3/apache-fediz/target/apache-fediz-1.4.3/apache-fediz-1.4.3/idp/war/fediz-*) to the Tomcat "webapps" directory.

      There are a few configuration changes to be made to Apache Tomcat before starting it. Download the HSQLDB jar and copy it to the Tomcat "lib" directory. Next edit 'conf/server.xml' and configure TLS on port 8443:

      The two keys referenced here can be obtained from 'apache-fediz/target/apache-fediz-1.4.3/apache-fediz-1.4.3/examples/samplekeys/' and should be copied to the root directory of Apache Tomcat. Tomcat can now be started.

      Next we have to configure Apache CXF Fediz to support Apache Syncope as a "service" via SAML SSO. Edit 'webapps/fediz-idp/WEB-INF/classes/entities-realma.xml' and add the following configuration:

      In addition, we need to make some changes to the "idp-realmA" bean in this file:

      • Add a reference to this bean in the "applications" list: <ref bean="srv-syncope" />
      • Change the "idpUrl" property to: https://localhost:8443/fediz-idp/saml
      • Change the port for "stsUrl" from "9443" to "8443".
      Now we need to configure Fediz to accept Syncope's signing cert. Edit the Metadata file you saved from Syncope in step 1.a. Copy the Base-64 encoded certificate in the "KeyDescriptor" section, and paste it (including line breaks) into 'webapps/fediz-idp/WEB-INF/classes/syncope.cert', enclosing it in between "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----".

      Now restart Apache Tomcat. Open a browser and save the Fediz metadata which is available at "http://localhost:8080/fediz-idp/metadata?protocol=saml", which we will require when configuring Apache Syncope.

      1.c) Configure the Apache CXF Fediz IdP in Syncope

      The final configuration step takes place in Apache Syncope again. In the "SAML 2.0 SP" configuration screen, click on the "Identity Providers" tab and click the "+" button and select the Fediz metadata that you saved in the previous step. Now logout and an additional login option can be seen:

      Select the URL for the SAML SSO IdP and you will be redirected to Fediz. Select the IdP in realm "A" as the home realm and enter credentials of "alice/ecila" when prompted. You will be successfully authenticated to Fediz and redirected back to the Syncope admin console, where you will be logged in as the user "alice". 

      2) Using IdP-initiated SAML SSO

      Instead of the user starting with the Syncope web console, being redirected to the IdP for authentication, and then redirected back to Syncope - it is possible instead to start from the IdP. In this section we will show how to configure Apache Syncope to support IdP-initiated SAML SSO using Okta.

      2.a) Configuring a SAML application in Okta

      The first step is to create an account at Okta and configure a SAML application. This process is mapped out at the following link. Follow the steps listed on this page with the following additional changes:
      • Specify the following for the Single Sign On URL: http://localhost:9080/syncope-console/saml2sp/assertion-consumer
      • Specify the following for the audience URL: http://localhost:9080/syncope-console/
      • Specify the following for the default RelayState: idpInitiated
      When the application is configured, you will see an option to "View Setup Instructions". Open this link in a new tab and find the section about the IdP Metadata. Save this to a local file and set it aside for the moment. Next you need to assign the application to the username that you have created at Okta.

      2.b) Configure Apache Syncope to support IdP-Initiated SAML SSO

      Log on to the Apache Syncope admin console using the admin credentials, and add a new Identity Provider in the SAML 2.0 SP extension as before, using the Okta metadata file that you saved in the previous section. Edit the metadata and select the 'Support Unsolicited Logins' checkbox. Save the metadata and make sure that the Okta user is also a valid user in Apache Syncope.

      Now go back to the Okta console and click on the application you have configured for Apache Syncope. You should seamlessly be logged into the Apache Syncope admin console.

      0 0

      Apache Syncope is a powerful open source Identity Management project, covered extensively on this blog. Amongst many other features, it allows the management of three core types - Users, Groups and "Any Objects", the latter of which can be used to model arbitrary types. These core types can be accessed via a flexible REST API powered by Apache CXF. In this post we will explore the concept of "membership" in Apache Syncope, as well as a new feature that was added for Syncope 2.0.7 which allows an easy way to see membership counts.

      1) Membership in Apache Syncope

      Users and "Any Objects" can be members of Groups in two ways - statically and dynamically. "Static" membership is when the User or "Any Object" is explicitly assigned membership of a given Group. "Dynamic" membership is when the Group is defined with a set of rules, which if they evaluate to true for a given User or "Any Object", then that User or "Any Object" is a member of the group. For example, a User could be a dynamic member of a group based on the value for a given User attribute. So we could have an Apache group with a dynamic User membership rule of "*" matching an "email" attribute.

      2) Exploring group membership via the REST API

      Let's examine group membership with some practical examples. Start Apache Syncope and log in to the admin console. Click on "Groups" and add a new group called "employee", accepting the default options. Now click on the "User" tab and add new Users called "alice" and "bob", with static membership of the "employee" group.

      Using a tool like "curl", we can access the REST API using the admin credentials to obtain information on "alice":

      • curl -u admin:password http://localhost:9080/syncope/rest/users/alice
      Note that "alice" has a "memberships" attribute pointing to the "employee" group. Next we can see information on the "employee" group via:
      • curl -u admin:password http://localhost:9080/syncope/rest/groups/employee

      3) Obtaining membership counts

      Now consider obtaining the membership count of a given group. Let's say we are interested in finding out how many employees we have - how can this be done? Prior to Apache Syncope 2.0.7, we had to leverage the power of FIQL, which underpins the search capabilities of the REST API of Apache Syncope:
      • curl -u admin:password "http://localhost:9080/syncope/rest/users?fiql=%24groups==employee"
      In other words, search for all Users who are members of the "employee" group. This returns a long list of all Users, even though all we care about is the count (which is encoded in the "totalCount" attribute). There is a new way to do this in Apache Syncope 2.0.7. Instead of having to search for Users, membership counts are now encoded in groups. So we can see the total membership counts for a given group just by doing a GET call:
      • curl -u admin:password http://localhost:9080/syncope/rest/groups/employee
      Following the example above, you should see a "staticUserMembershipCount" attribute with a value of "2". Four new attributes are defined for GroupTO:
      • staticUserMembershipCount: The static user membership count of a given group
      • dynamicUserMembershipCount: The dynamic user membership count of a given group
      • staticAnyObjectMembershipCount: The static "Any Object" membership count of a given group
      • dynamicAnyObjectMembershipCount: The dynamic "Any Object" membership count of a given group.
      Some consideration was given to returning the Any Object counts associated with a given Any Object type, but this was abandoned due to performance reasons.
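
      As a small sketch of how a client might use these attributes, the function below adds up the counters of a group. It assumes that the parsed JSON body of the GET call exposes the four attributes as top-level numeric fields, which is an assumption about the response layout; the function name is hypothetical.

```javascript
// Sum the four membership counters of a group.
// `group` is assumed to be the parsed JSON returned for a group;
// missing counters are treated as zero.
function membershipTotals(group) {
  const users = (group.staticUserMembershipCount || 0) +
                (group.dynamicUserMembershipCount || 0);
  const anyObjects = (group.staticAnyObjectMembershipCount || 0) +
                     (group.dynamicAnyObjectMembershipCount || 0);
  return { users: users, anyObjects: anyObjects };
}
```

      For the "employee" group created earlier, with two static user members and nothing else, this gives two users and zero Any Objects.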

      0 0

      After seeing a lot of questions surrounding incremental repair on the mailing list and after observing several outages caused by it, we figured it would be good to write down our advice in a blog post.

      Repair in Apache Cassandra is a maintenance operation that restores data consistency throughout a cluster. It is advised to run repair operations at least once every gc_grace_seconds to ensure that tombstones get replicated consistently, which avoids zombie records if you perform DELETE statements on your tables.

      Repair also facilitates recovery from outages that last longer than the hint window, or in case hints were dropped. For those operators already familiar with the repair concepts, there were a few back-to-basics moments when the behavior of repair changed significantly in the release of Apache Cassandra 2.2. The introduction of incremental repair as the default along with the generalization of anti-compaction created a whole new set of challenges.

      How does repair work?

      To perform repairs without comparing all data between all replicas, Apache Cassandra uses merkle trees to compare trees of hashed values instead.

      Merkle tree

      During a repair, each replica will build a merkle tree, using what is called a “validation compaction”. It is basically a compaction without the write phase, the output being a tree of hashes.

      Validation compaction

      Merkle trees will then be compared between replicas to identify mismatching leaves, each leaf containing several partitions. No difference check is made on a per partition basis: if one partition in a leaf is not in sync, then all partitions in the leaf are considered as not being in sync. Sending more data than is required is typically called overstreaming. Gigabytes of data can be streamed, even for one bit of difference. To mitigate overstreaming, people started performing subrange repairs, specifying the start/end tokens to repair in smaller chunks, which results in fewer partitions per leaf.
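
      The effect of leaf granularity on overstreaming can be illustrated with a toy model. This is only a sketch under strong simplifications: it models just the leaf level of the tree, uses a 32-bit FNV-1a hash as a stand-in for the real digest, and assumes both replicas hold the same partition keys in the same order. None of the names correspond to Cassandra internals.

```javascript
// FNV-1a 32-bit hash, standing in for the real digest function
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hash partitions in groups of `leafSize`, one hash per leaf
function leafHashes(partitions, leafSize) {
  const leaves = [];
  for (let i = 0; i < partitions.length; i += leafSize) {
    const chunk = partitions.slice(i, i + leafSize);
    leaves.push(fnv1a(chunk.map(function (p) {
      return p.key + "=" + p.value;
    }).join("|")));
  }
  return leaves;
}

// Count the partitions that get streamed: every partition of every
// mismatching leaf, whether or not it actually differs
function partitionsToStream(replicaA, replicaB, leafSize) {
  const a = leafHashes(replicaA, leafSize);
  const b = leafHashes(replicaB, leafSize);
  let streamed = 0;
  a.forEach(function (hash, i) {
    if (hash !== b[i]) {
      streamed += Math.min(leafSize, replicaA.length - i * leafSize);
    }
  });
  return streamed;
}

// Two replicas with eight partitions, differing in a single value
const replicaA = [0, 1, 2, 3, 4, 5, 6, 7].map(function (n) {
  return { key: "k" + n, value: "v" + n };
});
const replicaB = replicaA.map(function (p) {
  return p.key === "k5" ? { key: p.key, value: "x5" } : p;
});
```

      With four partitions per leaf, the single mismatch causes four partitions to be streamed; with one partition per leaf, only the one that differs is sent. Subrange repair moves a cluster towards the second situation.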

      With clusters growing in size and density, performing repairs within gc_grace_seconds started to get more and more challenging, with repairs sometimes lasting for tens of days. Some clever folks leveraged the immutable nature of SSTables and introduced incremental repair in Apache Cassandra 2.1.

      What is incremental repair?

      The plan with incremental repair was that once some data had been repaired, it would be marked as such and would never need to be repaired again.
      Since SSTables can contain tokens from multiple token ranges, and repair is performed by token range, it was necessary to be able to separate repaired data from unrepaired data. That process is called anticompaction.


      Once a repair session ends, each repaired SSTable will be split into 2 SSTables: one that contains the data that was repaired in the session (i.e. data that belonged to the repaired token range) and another one with the remaining unrepaired data. The newly created SSTable containing repaired data will be marked as such by setting its repairedAt timestamp to the time of the repair session.
      When performing validation compaction during the next incremental repair, Cassandra will skip the SSTables with a repairedAt timestamp higher than 0, and thus only compare data that is unrepaired.
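
      A toy model of anticompaction might look as follows. The SSTable shape, the integer tokens, and the (start, end] range convention are simplifications chosen for the sketch, and the function names are hypothetical; only the repairedAt marker is taken from the description above.

```javascript
// Split one SSTable into a repaired and an unrepaired part.
// An SSTable is modelled as { repairedAt, rows: [{ token }] } and the
// repaired token range as (start, end].
function anticompact(sstable, range, sessionTime) {
  const inRange = function (row) {
    return row.token > range.start && row.token <= range.end;
  };
  return [
    // data covered by the repair session, stamped with the session time
    { repairedAt: sessionTime, rows: sstable.rows.filter(inRange) },
    // everything else stays unrepaired
    { repairedAt: 0, rows: sstable.rows.filter(function (row) { return !inRange(row); }) },
  ];
}

// SSTables the next incremental repair still has to validate
function needsValidation(sstables) {
  return sstables.filter(function (s) { return s.repairedAt === 0; });
}
```

      Splitting an SSTable holding tokens 5, 15, 25 and 35 on the range (10, 30] produces a repaired SSTable with 15 and 25 and an unrepaired one with 5 and 35; only the latter is considered by the next incremental repair.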

      Incremental repair

      Incremental repair was actually promising enough that it was promoted to the default repair mode in C* 2.2, and anticompaction has since also been performed during full repairs.
      To say the least, this was a bit of a premature move from the community as incremental repair has a few very annoying drawbacks and caveats that would make us consider it an experimental feature instead.

      The problems of incremental repair

      The nastiest one is filed in the Apache Cassandra JIRA as CASSANDRA-9143 with a fix ready for the unplanned 4.0 release. Between validation compaction and anticompaction, an SSTable that is involved in a repair can be compacted away as part of the standard compaction process on one node and not on the others. Such an SSTable will not get marked as repaired on that specific node while the rest of the cluster will consider the data it contained as repaired.
      Thus, on the next incremental repair run, all the partitions contained in that SSTable will be seen as inconsistent, which can generate a fairly large amount of overstreaming. This is a particularly nasty bug when incremental repair is used in conjunction with Leveled Compaction Strategy (LCS). LCS is a very intensive strategy where SSTables get compacted far more often than with STCS and TWCS. LCS creates fixed-size SSTables, which can easily lead to thousands of SSTables for a single table. The way streaming occurs in Apache Cassandra during repair means that overstreaming of LCS tables can create tens of thousands of small SSTables in L0, which can ultimately bring nodes down and affect the whole cluster. This is particularly true when the nodes use a large number of vnodes.
      We have seen this happen on several customers' clusters, and it then requires a lot of operational expertise to bring the cluster back to a sane state.

      In addition to the bugs related to incorrectly marked SSTables, anticompaction carries a significant overhead. It was quite a big surprise for users upgrading from 2.0/2.1 to 2.2 when they tried to run repair. If there is already a lot of data on disk, the first incremental repair can take a lot of time (if not forever) and create a situation similar to the one above, with a lot of SSTables being created due to anticompaction. Keep in mind that anticompaction will rewrite all SSTables on disk to separate repaired and unrepaired data.
      While it’s not necessary anymore to “prepare” the migration to incremental repair, we would strongly advise against running it on a cluster with a lot of unrepaired data without first marking SSTables as repaired. This would require running a full repair first to make sure data is actually repaired, but now even full repair performs anticompaction, so… you see the problem.

      A safety measure has been put in place to prevent SSTables going through anticompaction from being compacted, for valid reasons. The problem is that it also prevents such an SSTable from going through validation compaction, which causes repair sessions to fail if an SSTable is being anticompacted. Given that anticompaction also occurs with full repairs, this creates the following limitation: you cannot run repair on more than one node at a time without risking failed sessions due to concurrency on SSTables. This is true for incremental repair but also for full repair, and it changes many of the habits you had for running repair in previous versions.

      The only way to perform repair without anticompaction in “modern” versions of Apache Cassandra is subrange repair, which fully skips anticompaction. To perform a subrange repair correctly, you have three options:

      Regardless, it is extremely important to note that repaired and unrepaired SSTables can never be compacted together. If you stop performing incremental repairs once you have started, you could end up with outdated data not being cleaned up on disk due to the presence of the same partition in both states. So if you want to continue using incremental repair, make sure it runs very regularly, and if you want to move back to full subrange repairs you will need to mark all SSTables as unrepaired using sstablerepairedset.

      Note that because subrange repair does not perform anticompaction, it is not possible to perform subrange repair in incremental mode.

      Repair: state of the art in late 2017

      Here’s our advice at the time of writing this blog post, based on our experience with customers: perform full subrange repair exclusively for now and do not ever run incremental repair. Just pretend that feature does not exist for now.
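
      As a rough illustration of what cutting a repair into subranges involves, the sketch below splits a token range into contiguous pieces. It assumes the Murmur3Partitioner token space of -2^63 to 2^63-1 and uses a hypothetical splitRange helper. In a real cluster the split would be done within each token range owned by a node, and the resulting start/end tokens would then be passed to the repair command.

```javascript
// Split the token range from `start` to `end` into `pieces` contiguous subranges.
// BigInt is used because the assumed token space does not fit in a JS number.
function splitRange(start, end, pieces) {
  const width = end - start;
  const n = BigInt(pieces);
  const subranges = [];
  for (let i = 0n; i < n; i++) {
    const from = start + (width * i) / n;
    // The last subrange always ends exactly at `end`, absorbing any remainder.
    const to = i === n - 1n ? end : start + (width * (i + 1n)) / n;
    subranges.push({ start: from, end: to });
  }
  return subranges;
}

const subranges = splitRange(-(2n ** 63n), 2n ** 63n - 1n, 4);
for (const r of subranges) {
  console.log(`${r.start} .. ${r.end}`);
}
```

      Each subrange starts where the previous one ends, so the pieces together cover the whole range.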

      While the idea behind incremental repair is brilliant, the implementation still has flaws that can cause severe damage to a production cluster, especially when using LCS and DTCS. The improvements and fixes planned for 4.0 will need to be thoroughly tested to prove they fixed incremental repair and allow it to be safely used as a daily routine.

      We are confident that future releases will make incremental repair better, allowing the operation to be safe and blazing fast compared to full repairs.

      0 0

      I originally published this article on SD Times, republishing it to keep it around for posterity…

      If you’re looking at embracing open source today, you might be a bit late to the game. Using open-source software is mainstream now, and being involved in open-source projects is nothing to write home about either. Everybody does it, we know how it works, its value is proven.

      But what’s next? Sharing source code openly is a given in open-source projects, but in the end it’s only about sharing lines of text. The real long-term power of successful open-source projects lies in how their communities operate, and that’s where open development comes in.

      Shared communications channels. Meritocracy. Commit early, commit often. Back your work by issues in a shared tracker. Archive all discussions, decisions and issues about your project, and make that searchable. All simple principles that, when combined, make a huge difference to the efficiency of our corporate projects.

      But, of course, the chaotic meritocracies of open-source projects won’t work for corporate teams, right? Such teams require a chain of command with strictly defined roles. Corporate meritocracy? You must be kidding.

      I’m not kidding, actually: Open development works very well in corporate settings, and from my experience in both very small and fairly large organizations, much better than old-fashioned top-to-bottom chains of command and information segregation principles. Empower your engineers, trust them, make everything transparent so that mistakes can be caught early, and make sure the project’s flow of information is continuous and archived. Big rewards are just around the corner—if you dive in, that is.

      What’s open development?
      Open development starts by using shared digital channels to communicate between project members, as opposed to one-to-one e-mails and meetings. If your team’s e-mail clients are their knowledge base, that will go away with them when they leave, and it’s impossible for new project members to acquire that knowledge easily.

      A centralized channel, like a mailing list, allows team members to be aware of everything that’s going on. A busy mailing list requires discipline, but the rewards are huge in terms of spreading knowledge around, avoiding duplicate work and providing a way for newcomers to get a feel for the project by reviewing the discussion archives. At the Apache Software Foundation, we even declare that “If it didn’t happen on the dev list, it didn’t happen,” which is a way of saying that whatever is worth saying must be made available to all team members. No more discrepancies in what information team members get; it’s all in there.

      The next step is sharing all your code openly, all the time, with all stakeholders. Not just in a static way, but as a continuous flow of commits that can tell you how fast your software is evolving and where it’s going, in real time.

      Software developers will sometimes tell you that they cannot show their code because it’s not finished. But code is never finished, and it’s not always beautiful, so who cares? Sharing code early and continuously brings huge benefits in terms of peer reviews, learning from others, and creating a sense of belonging among team members. It’s not “my” code anymore, it’s “our” code. I’m happy when someone sees a way to improve it and just does it, sometimes without even asking for permission, because the fix is obvious. One less bug, quality goes up, and “shared neurons in the air” as we sometimes say: all big benefits to a team’s efficiency and cohesion.

      Openly sharing the descriptions and resolutions of issues is equally important and helps optimize usage of a team’s skills, especially in a crisis. As in a well-managed open-source project, every code change is backed by an issue in the tracker, so you end up with one Web page per issue, which tells the full history of why the change was made, how, when, and by whom. Invaluable information when you need to revisit the issue later, maybe much later when whoever wrote that code is gone.

      Corporate projects too often skip this step because their developers are co-located and can just ask their colleague next door directly. By doing that, they lose an easy opportunity to create a living knowledgebase of their projects, without much effort from the developers. It’s not much work to write a few lines of explanation in an issue tracker when an issue is resolved, and, with good integration, rich links will be created between the issue tracker and the corresponding source code, creating a web of valuable information.

      The dreaded “When can we ship?” question is also much easier to answer based on a dynamic list of specific issues and corresponding metadata than by asking around the office, or worse, having boring status meetings.

      The last critical tool in our open development setup is self-service archives of all that information. Archived mailing lists, resolved issues that stay around in the tracker, source-code control history, and log messages, once made searchable, make project knowledge available in self-service to all team members. Here as well, forget about access control and leave everything open. You want your engineers to be curious when they need to, and to find at least basic information about everything that’s going on by themselves, without having to bother their colleagues with meetings or tons of questions. Given sufficient self-service information, adding more competent people to a project does increase productivity, as people can largely get up to speed on their own.

      While all this openness may seem chaotic and noisy to the corporate managers of yesterday, that’s how open-source projects work. The simple fact that loose groups of software developers with no common boss consistently produce some of the best software around should open your eyes. This works.

      0 0

      One of the main principles at Apache (as in The Apache Software Foundation) is "Community over Code": the goal is to build projects that survive individual community members losing interest or the time to contribute.

      In his book "Producing Open Source Software" Karl Fogel describes this model of development as Consensus-based Democracy (in contrast to benevolent dictatorship): "Consensus simply means an agreement that everyone is willing to live with. It is not an ambiguous state: a group has reached consensus on a given question when someone proposes that consensus has been reached and no one contradicts the assertion. The person proposing consensus should, of course, state specifically what the consensus is, and what actions would be taken in consequence of it, if those are not obvious."

      What that means is that decisions are not reserved for one person: pretty much anyone can declare that a final decision was made. It also means decisions can be stopped by individuals on the project.

      This model of development works well if what you want for your project is resilience to people leaving, in particular those high up in the ranks, at the cost of nobody having complete control. It means you are moving slower, with the benefit of getting more people on board and of the project carrying on with your mission after you leave.

      There are a couple of implications to this goal: if for whatever reason one single entity needs to retain control over the project, you had better not enter the incubator as suggested here. Balancing control and longevity is particularly tricky if you or your company believes they need to own the roadmap of the project. It's also tricky if your intuitive reaction to hiring a new engineer is to give them committership to the project on their first day: think again, keeping in mind that Money can't buy love. If you're still convinced they should be made committer, Apache probably isn't the right place for your project.

      Once you go through the process of giving up control with the help of your mentors, you will learn to trust others: trust others to pick up tasks you leave open, trust others to take the right decision even if you would have done things otherwise, trust others to come up with solutions where you are lost. Essentially, as Sharan Foga said, Trust the water.

      Even coming to the project at a later stage as an individual contributor you'll go through the same learning experience: you'll learn to trust others with the patch you wrote. You'll have to learn to trust others to take your bug report seriously. If the project is well run, people will treat you as an equal peer, with respect and with appreciation. They'll likely treat you as part of the development team in as many decisions as possible. After all, that's what these people want to recruit you for: a position as a volunteer in their project. Doing that means starting to Delegate like a Pro, as Deb Nicholson once explained at ApacheCon. It also means training your capacity for Empathy, as Leslie Hawthorn explained at FOSDEM. It also means treating all contributions alike.

      There's one prerequisite to all of this working out, though: working in the open (as in "will be crawled, indexed and made visible by the major search engine of the day"), giving control to others over your baby project and potentially over what earns your daily living, means you need a lot of trust not only in others but also in yourself. If you're in a position where you're afraid that missteps will have negative repercussions on your daily life, you won't become comfortable with all of that. For projects coming to the incubator, as well as companies paying contributors to become open source developers in their projects, in my personal view that's an important lesson: unless committers already feel self-confident and independent enough of your organisation, as well as of the team they are part of, to take decisions on their own, you will run into trouble walking towards at least Apache.

      0 0

      We just wanted to share the latest status of the Camel in Action 2nd edition book here a few days before Christmas.

      We had hoped the book would be done today. However, there has been an unexpected holdup because the indexing of this big book is taking longer to process than usual. In light of this, Manning has pushed the schedule, so the book is expected to be done by the end of the first week of January 2018.

      This gives us, the authors, an extra opportunity to review all the pages of the book one more time, which will be our 5th review. It's unusual for a book to have as many final reviews as we have ended up doing with this one. It's all good in the end, as it means higher quality and that we will correct cosmetic details that might otherwise have been missed.

      For example we spotted a glitch in figure 17.9, where the dashed box is not 100% horizontal in the bottom as shown in the following before vs after screenshots.

      Figure 17.9 - Before, with the glitch: the dashed line at the bottom is not 100% horizontal
      Figure 17.9 - After, with the dashed line at the bottom fixed

      Manning has told us that they expect the page total in the print book to be around 900. In addition we have two bonus chapters available for readers to download, which are 26 and 18 pages long. Altogether that is nearly double the number of pages of the first edition, which was already a big book.

      So hold on a little longer and we will be off to a great start in 2018 with an up-to-date, modern book covering all aspects of Apache Camel. The book uses the latest Camel release, 2.20.1 at the time of writing, and we include tips on what is coming in the Camel 2.21 release wherever relevant to our readers.

      Merry x-mas and happy new-year

      Claus Ibsen
      Jonathan Anstey

      0 0

      Finally the next Alfresco DevCon is arriving in January, and I'm very proud to have the opportunity to contribute to this event as a trainer and a speaker. I'm pretty sure that spending some good time with other Alfresco friends in Lisbon will be nice too :)

      Training sessions - Tuesday 16th

      If you don't know anything about Alfresco but want to get your hands on it, you can join the training sessions that will be held during the first day of the conference.

      This can be a good way for newcomers to become familiar with the approaches and best practices needed for developing a content and business platform using Alfresco.

      We will cover both the Content Services and the Process Services platforms, so this is a fantastic occasion to learn the basics from people who have worked on them for a decade.

      The final session will be held by Ole Hejlskov (Developer Evangelist at Alfresco), who will show you how to develop your application using ADF. In practice, this is the most intensive and most comprehensive training day that Alfresco has ever run!

      I'll work in partnership with the ECMCoreAveri Team, together with my old friends Shoeb Sarguroh and Oliver Laabs. We are Certified Alfresco Instructors helping the Alfresco Training Team deliver the intensive introductory course. We will support attendees with installing, configuring and developing the labs throughout the day.


      We hope that this intensive training day will be useful for all of you who want to start a project using Alfresco.

      Speaking about content migration - Thursday 18th

      After some years spent contributing to Apache ManifoldCF as part of the PMC, we have started a new, and potentially overwhelming, turn for the project: content migration.

      ManifoldCF is a repository crawler whose primary goal was managing the indexing process using scheduled jobs. After a discussion in the community we realized that we could also use it for content migration, not only for search purposes.

      Then we started to implement some Output Connectors dedicated to migrating content. The first connector that I implemented is the CMIS Output Connector. This means that you can migrate content from any repository to any CMIS-compliant repository, also handling incremental migration and removed content.

      I'm very happy to have started this new adventure, and during this journey I met one of the people who helped me understand the Alfresco platform better and who gave me a lot in terms of knowledge and experience: Luis Cabaceira (Solutions Architect at Alfresco).

      I learnt a lot reading his white papers about sizing and performance tuning, and now working closely with him is a huge and priceless thing for me. Really, thank you man ;)

      Luis started to contribute to ManifoldCF by implementing an Alfresco BFSI Output Connector to make any migration to Alfresco easier. He has also become a Committer on the project, and I'm sure that he will bring a lot of value and contributions, taking care of some of the current Alfresco connectors and giving a huge help in the content migration area.

      We hope that our session will bring value and ideas to all of you and we really hope to receive any kind of feedback on our current work.

      This will help us make a big release of ManifoldCF, tagged as Content Migration Enabler, scheduled for the next 2.10 version (around Q2 2018).

      See you at the conference to share awesome experiences or just to say hello :)

      Below you will find the abstract of our presentation:

      Content migration using Apache ManifoldCF Output Connectors

      Nowadays, enterprise digital content is scattered across several independent systems and subsystems, which perform services such as user authentication and document storage, and provide search capabilities. Centralising enterprise data into a single repository is a growing necessity for organisations. Our talk proposes an approach that could be the "silver bullet" that opens a clear path to enterprise digital content centralisation. In its genesis, Apache ManifoldCF is a crawler that allows you to manage content indexes in your search engines; this was the main goal of the product. We've realised that we could leverage ManifoldCF to also migrate content, and not only indexes, making it a very good migration tool. This talk will focus on 2 new output connectors for Apache ManifoldCF that are being developed by us.

      This is just the first slide of our presentation, stay tuned and come to see me and Luis speaking about content migration using Apache ManifoldCF

      0 0

      A few weeks ago, I had the pleasure of hitting two excellent conferences in one week: SpringOne and The Rich Web Experience. The primary reason I like both conferences so much is that there are so many familiar faces.

      I had a gas hanging out with folks from Pivotal after I arrived on Monday night. On Tuesday, I thoroughly enjoyed the opening keynote. Seeing the unveiling of Spring Boot 2.0's most impressive feature was spectacular too!

      I walked to the Okta office for some swag that afternoon, then proceeded to the Atomist happy hour. I talked with Rod Johnson about how Atomist might be able to help update our example apps and the Okta Developer blog. Since keeping our posts and examples up-to-date is a maintenance burden, I think Atomist could be a huge help.

      After happy hour, a bunch of us joined Heroku for a delicious dinner and fun conversations.

      On Wednesday, I delivered my talk on Bootiful Development with Spring Boot and React. You can find my slides on Speaker Deck.

      It was recorded and published to YouTube as well.

      After my talk ended, I only had 70 minutes before my flight took off for Florida and the Rich Web Experience. Luckily, there was hardly any traffic and I found myself boarding with 23 minutes to spare.

      The Rich Web Experience

      At the Rich Web Experience, I had two back-to-back talks on Thursday morning. The first was on OAuth and is modeled off my What the Heck is OAuth blog post. I was surprised to have a packed room and appreciated the enthusiastic audience. You can find my slides on Speaker Deck or view them below.

      I had an extra half-hour (compared to SpringOne) to deliver my Bootiful React talk, but I still managed to run out of time. The good news is it was largely because of audience interaction and questions. I feel like presentations are a lot more enjoyable when conversations happen during them. I published my slides afterward. The major difference between this deck and the one at SpringOne is I included Kent Dodds' free React courses on

      I took a nice stroll along the Clearwater beaches that afternoon. I felt like a huge weight had been lifted off my shoulders since I was done speaking for the year.

      On Friday, I flew back to Denver and spent the afternoon polishing all the READMEs in our developer example apps. We recently discovered that a lot of folks were trying our examples without reading our blog posts. As a developer, I know it's nice to clone a project, configure it, and run it. This should be much easier now. For example, if you look at the README for the okta-spring-boot-2-angular-5-example, you should be able to modify and run without reading its associated blog post.

      Devoxx4Kids Denver

      The next day, I helped organize a Devoxx4Kids Denver on building Fruit Ninja with Scratch. Melissa McKay was the class instructor and the kids had a blast. The workshop was hosted at Thrive Ballpark and they published a blog post about how Devoxx4Kids is Teaching Kids to Thrive.

      Team Building and Denver JUG

      The following week, I traveled to San Francisco to meet with my team and do some team building activities. I thoroughly enjoyed the stroll to work on Tuesday morning, and bowling that afternoon.

      I flew back Wednesday and made it just in time for the Denver JUG Holiday party. We had a pretty good turnout, announced some awards, voted on Venkat's talk in January, and gave out a few prizes. You can read more about the festivities on the Denver JUG blog.

      When I drove home that night, I felt like George Bailey rushing home at the end of It's a Wonderful Life! The joy of being home without travel on the horizon is a wonderful feeling.

      At the end of that week, I was able to find time to work on the Ionic Module for JHipster and release it.

      Home for the Holidays

      It's a great feeling to be home for the holidays. It was Trish's birthday weekend last weekend, so we watched her compete in a couple horse shows with Tucker. They sure do look good together, don't they?

      I'm done traveling for the year and I don't have any overnight travel scheduled until mid-February. My TripIt stats show I traveled quite a bit this year, and I'm looking forward to speaking at more JUGs and fewer conferences next year.

      2017 TripIt Stats

      I spent 141 days on the road, but I'm grateful for getting to attend so many cool conferences in many exotic locations.

      2017 Conferences

      If you want to tinker with some code over the break, you can check out my blog post on how to use Spring Security 5.0 with OIDC or my buddy Nate's Spread Serverless Holiday Cheer with Lambda and API Gateway.

      Happy Holidays everyone!😊

      0 0
    • 12/24/17--16:22: Nick Kew: Clearing the air
    • It’s been a long time since I’ve blogged any good rant about matters in the news here.  It’s not that I don’t sometimes have things I could say, nor even that my words would be superfluous because the Chattering Classes in the mainstream media are already saying them.  Rather it’s a lack of round tuits, and perhaps because I might sometimes post a rant elsewhere instead (for example, El Reg on predominantly techie matters).

      So how better to try and restart than by blogging a positive story.  One of those rare occasions where our government appears possibly to be doing the Right Thing about one of today’s most serious problems.  I can’t find it on the BBC website (where I looked after hearing it on the radio), but Google finds it at the FT.

      The story is rather different between the BBC and the FT, but the gist of it is that Michael Gove and/or the Department of the Environment (of which he is minister in charge) is at last considering proposals to clean up our air, by restricting or banning domestic wood and coal fires.  These fires have become a huge problem in recent years.  I believe they have standards about keeping their own house unpolluted, but for anyone who happens to live downwind[1] of such fires, it can fill the house with smoke for extended periods: many hours a day, many months a year.  We’re talking levels of smoke comparable to not one or two but a great many smokers in the house, and this is seriously nasty smoke that hasn’t gone through the considerable cleanup that’s been forced onto the tobacco industry in recent decades.

      In summary, for people affected by this, it’s an order of magnitude worse than regular exposure to passive smoking, or to those diesel emissions that have created such a fuss in recent times.

      Governments have occasionally been known to do the right thing on pollution.  In the 1950s we had clean air legislation to clear up a reportedly-serious smog problem.  In my lifetime we’ve rid ourselves of most of the blight of tobacco smoke (including legislation that has been very successful despite my reservations at the time).  Let’s hope we can see the spirit of that 1950s legislation revived and give us back our air!

      [1] The prevailing wind here is approximately west-south-west, and a very common winter weather pattern includes mild damp weather and very light westerly winds.  So the greatest killer is to be between east and northeast of a woodburner.

      0 0

      I mentioned earlier that one could link an RxJava2 Flowable with a JAX-RS AsyncResponse through a Subscriber which makes a best effort at streaming the data pieces, converted to JSON array elements; see this example.

      That works, but it requires the application code to refer to both the JAX-RS AsyncResponse and the CXF-specific JsonStreamingAsyncSubscriber (RxJava2-specific at the earlier stage), as opposed to simply returning a Flowable from the resource method.

      In the meantime, John Ament added the initial Reactor integration code, and as part of this work John also provided the org.reactivestreams-compatible JsonStreamingAsyncSubscriber to be optionally used with the CXF Reactor invoker.

      As a result we found the opportunity to do some refactoring and introduce a simple org.reactivestreams utility module which is now reused between the CXF RxJava2 invoker and the Reactor invoker. The common invoker code that both invokers delegate to checks whether JSON is expected and, if so, registers JsonStreamingAsyncSubscriber as an org.reactivestreams.Subscriber with the org.reactivestreams.Publisher, which can be either an RxJava2 Flowable or a Reactor Flux, or, in fact, a Java9 Flow.

      The end result is that users can now write simpler code by returning Flowable or Flux from the service methods. 

      It is an interesting but simple example of reusing the org.reactivestreams aware code between different org.reactivestreams implementations.

      0 0

      Apache CXF has done some initial work to have OpenAPI v3 JSON reported from JAX-RS endpoints.

      Andriy Redko has started with the OpenApiFeature implementation which depends on the latest OpenApi v3 aware swagger-core and swagger-jaxrs libraries and demoed it here.

      In the meantime I wrote a Swagger2 to OpenApi v3 JSON conversion filter which reacts to openapi.json queries by converting the Swagger2 swagger.json produced by Swagger2Feature to openapi.json. The idea is to make it easier for existing upstream code (which has already integrated Swagger2Feature) to start experimenting with OpenAPI v3 before switching to the new feature (and its dependencies).

      This effort is still a work in progress but the results in both cases are promising. The new feature and the conversion filter will require some more improvements but you can start experimenting with them right now. And if you are someone like me then you will be positively surprised that SwaggerUI  3.6.1 and newer can handle both Swagger2 and OpenAPI v3 JSON :-).

      Enjoy !

      0 0

      I've been short of ideas on how to write the regular, and last, off-topic post of the year, wondering which piece of music I should link to.

      And then the inspiration came from a completely unexpected source.

      Those of you who follow Arsenal FC in the English Premier League know that Arsenal can either draw against Liverpool but in such a way that fans will remember it for years (yes, that Liverpool 4 to Arsenal 4 draw), or, most likely, lose badly to this Merseyside team, something like 1:4 or similar.

      So less than a week ago, Arsenal were playing against Liverpool in London, losing 0:2. Oh well, most Arsenal fans thought, one of those days which can only be described in Fever Pitch. Then, in the 2nd half, within a few minutes, while the fans were having mince pies and tea, Arsenal were 3:2 up, with Liverpool managing to equalize. The game saw many mistakes and brilliant moves, and fans just had the day of the year watching it. I liked this summary.

      How would I translate that to a New Year wish for you, the software engineers ? Here it is:

      Enjoy your work next year, try to do something extraordinary, something new, and don't be afraid to make mistakes :-)

      Happy New Year !

      0 0


      You've seen web sites with stock prices or retweet counts that update in real time. However, such sites are more the exception than the norm. WebSockets make it easy, and are widely supported, but are not used as much as they could be.

      Examples provided for WebSockets typically don't focus on the "pubsub" use case; instead they tend to focus on echo servers and the occasional chat server. These are OK as far as they go.

      This post provides three mini-demos that implement the same design pattern in JavaScript on both the client and server.

      Quick Start

      For the impatient who want to see running code,

      git clone
      cd websocket-demos
      npm install
      node server.js

      After running this, visit http://localhost:8080/ in a browser, and you should see something like this:


      • one
      • two
      • three

      Server support

      The primary responsibility of the server is to maintain a list of active websocket connections. The code below will maintain three such sets, one for each of the demos provided.

      // attach to web server
      var wsServer = new websocket.server({httpServer: httpServer});
      // three sets of connections
      var connections = {
        text: new Set(),
        html: new Set(),
        json: new Set()
      };
      // when a request comes in for one of these streams, add the websocket to the
      // appropriate set, and upon receipt of close events, remove the websocket
      // from that set.
      wsServer.on('request', (request) => {
        var url = request.httpRequest.url.slice(1);
        if (!connections[url]) {
          // reject request if not for one of the pre-identified paths
          request.reject();
          console.log((new Date()) + ' ' + url + ' connection rejected.');
          return;
        }
        // accept request and add to the connection set based on the request url
        var connection = request.accept('ws-demo', request.origin);
        console.log((new Date()) + ' ' + url + ' connection accepted.');
        connections[url].add(connection);
        // whenever the connection closes, remove connection from the relevant set
        connection.on('close', (reasonCode, description) => {
          console.log((new Date()) + ' ' + url + ' connection disconnected.');
          connections[url].delete(connection);
        });
      });

      The code is fairly straightforward. Three sets are defined; and when a request comes in it is either accepted or rejected based on the path part of the URL of the request. If accepted, the connection is added to the appropriate set. When a connection is closed, the connection is removed from the set.
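
      The bookkeeping can be tried out without a real web socket server. The sketch below repeats the same accept/reject logic against plain objects and adds the broadcast step that the demos rely on later. The names handleRequest and broadcast, and the fake request and connection objects they expect, are stand-ins invented for this illustration and are not part of the websocket package.

```javascript
// Stand-alone sketch of the connection bookkeeping; no real sockets involved.
const connections = {text: new Set(), html: new Set(), json: new Set()};

// `request` is a hypothetical stand-in offering url, accept() and reject()
function handleRequest(request) {
  const known = Object.prototype.hasOwnProperty.call(connections, request.url);
  if (!known) {
    // not one of the pre-identified paths
    request.reject();
    return null;
  }
  const set = connections[request.url];
  const connection = request.accept();
  set.add(connection);
  // forget the connection once it closes
  connection.onclose = () => set.delete(connection);
  return connection;
}

// send one message to every connection subscribed to a stream
function broadcast(stream, data) {
  for (const connection of connections[stream]) connection.send(data);
}
```

      Keeping one Set per stream means a broadcast is a plain loop over the subscribers of that stream, and a closed connection disappears from the loop as soon as its close handler runs.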


      Client Support

      The client's responsibility is to open the socket, and to keep it open.

      function subscribe(path, callback) {
        var ws = null;
        // assumption: the address of the current page is used as the base URL
        var base = window.location.href;
        function openchannel() {
          if (ws) return;
          var url = new URL(path, base.replace('http', 'ws'));
          ws = new WebSocket(url.href, 'ws-demo');
          ws.onopen = (event) => {
            console.log(path + ' web socket opened!');
          };
          ws.onmessage = (event) => {
            callback(event.data);
          };
          ws.onerror = (event) => {
            console.log(path + ' web socket error:');
            ws = null;
          };
          ws.onclose = (event) => {
            console.log(path + ' web socket closed');
            ws = null;
          };
        }
        // open (and keep open) the channel
        setInterval(() => openchannel(), 2000);
      }

      A subscribe method is defined that accepts a path and a callback. The path is used to construct the URL to open. The callback is called whenever a message is received. Errors and closures cause the ws variable to be set to null. Every two seconds, the ws variable is checked, and an attempt is made to reestablish the socket connection when this value is null.
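
      The keep-open logic is easier to experiment with when the socket constructor is passed in, since it can then be exercised without a browser. This is only a sketch of that idea: makeChannel and its createSocket parameter are hypothetical names, and in a browser createSocket could be something like (url) => new WebSocket(url, 'ws-demo').

```javascript
// Sketch of the keep-open logic with the socket constructor passed in.
function makeChannel(url, callback, createSocket) {
  let ws = null;
  function open() {
    if (ws) return;                       // already open (or opening)
    const socket = createSocket(url);
    ws = socket;
    socket.onmessage = (event) => callback(event.data);
    // only forget the socket if it is still the current one, so that a
    // late close event from an old socket cannot discard its replacement
    const drop = () => { if (ws === socket) ws = null; };
    socket.onerror = drop;
    socket.onclose = drop;
  }
  return {open: open, isOpen: () => ws !== null};
}
```

      Calling channel.open() from a timer every two seconds then gives the same behaviour as the subscribe function: nothing happens while the socket is up, and a new one is created after an error or a close.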

      First example - textarea

      Now it is time to put the sets of server connections and the client subscribe function to use.

      Starting with the client:

      var textarea = document.querySelector('textarea');
      // initially populate the textarea with the contents of data.txt from the
      // server
      fetch("/data.txt").then((response) => {
        response.text().then((body) => { textarea.value = body })
      });
      // whenever the textarea changes, send the new value to the server
      textarea.addEventListener('input', (event) => {
        fetch("/data.txt", {method: 'POST', body: textarea.value});
      });
      // whenever data is received, update textarea with the value
      subscribe('text', (data) => { textarea.value = data });

      The value of the textarea is fetched from the server on page load. Changes made to the textarea are posted to the server as they occur. Updates received from the server are loaded into the textarea. Nothing to it!

      Now, onto the server:

      // Return the current contents of data.txt
      app.get('/data.txt', (request, response) => {
        response.sendFile(dirname + '/data.txt');
      });
      // Update contents of data.txt
      app.post('/data.txt', (request, response) => {
        var fd = fs.openSync(dirname + '/data.txt', 'w');
        request.on('data', (data) => fs.writeSync(fd, data));
        request.on('end', () => {
          fs.closeSync(fd);
          response.sendFile(dirname + '/data.txt');
        });
      });
      // watch for file system changes.  when data.txt changes, send new raw
      // contents to all /text connections.
      fs.watch(dirname, {}, (event, filename) => {
        if (filename == 'data.txt') {
          fs.readFile(filename, 'utf8', (err, data) => {
            if (data && !err) {
              for (var connection of connections.text) connection.send(data);
            }
          });
        }
      });

      Requests to get data.txt cause the contents of the file to be returned. Post requests cause the contents to be updated. It is the last block of code that we are most interested in here: the file system is watched for changes, and whenever data.txt is updated, it is read and the results are sent to each text connection. Pretty straightforward!

      If you visit http://localhost:8080/textarea in multiple browser windows, you will see a textarea in each. Updating any one window will update all. What you have is the beginning of a collaborative editing application, though there would really need to be more logic put in place to properly serialize concurrent updates.
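
      One common way to serialize concurrent updates is to have each client say which version of the text its edit was based on, and to have the server refuse edits that were based on a stale version. The sketch below is not part of the demos: createDocument and its read and update methods are hypothetical names, and a real collaborative editor would more likely merge conflicting edits than reject them.

```javascript
// Sketch of version-checked updates (not part of the demos).
function createDocument(initialText) {
  let text = initialText;
  let version = 0;
  return {
    read: () => ({text: text, version: version}),
    // accept the edit only if it was based on the current version
    update: (newText, baseVersion) => {
      if (baseVersion !== version) {
        // stale edit: report the current state so the client can retry
        return {ok: false, text: text, version: version};
      }
      text = newText;
      version += 1;
      return {ok: true, text: text, version: version};
    }
  };
}
```

      With something like this in place, the POST handler would pass along the version the client last saw, and a rejected client would reload the current text before trying again.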

      Second example - markdown

      The first example has the server sending plain text content. This next example deals with HTML. The marked package is used to convert text to HTML on the server.

      This client is simpler in that it doesn't have to deal with sending updates to the server:

      // initially populate the body with the converted markdown obtained
      // from the server
      fetch("/data.html").then((response) => {
        response.text().then((body) => { document.body.innerHTML = body })
      });
      // whenever data is received, update body with the data
      subscribe('html', (data) => { document.body.innerHTML = data });

      The primary difference between this example and the previous one is that the content is placed into document.body.innerHTML instead of textarea.value.

      Like the client, the server portion of this demo consists of two blocks of code:

      app.get('/data.html', (request, response) => {
        fs.readFile('data.txt', 'utf8', (error, data) => {
          if (error) {
            response.status(404).end(); // assumed error response
          } else {
            marked(data, (error, content) => {
              if (error) {
                response.status(500).end(); // assumed error response
              } else {
                response.send(content);
              }
            });
          }
        });
      });
      // when data.txt changes, send converted markdown output to all /html connections.
      fs.watch(dirname, {}, (event, filename) => {
        if (filename == 'data.txt') {
          fs.readFile(filename, 'utf8', (err, data) => {
            if (data && !err) {
              marked(data, (err, content) => {
                if (!err) {
                  for (var connection of connections.html) connection.send(content);
                }
              });
            }
          });
        }
      });

      The salient difference between this example and the previous example is the call to the marked function to perform the conversion.

      If you visit http://localhost:8080/markdown, you will see the text converted from markdown to HTML. You can also visit http://localhost:8080/ to see both of these demos side by side, in separate frames. Updates made in the window on the left will be reflected on the right.

      No changes were required to the first demo to make this happen as both demos watch for file system changes. In fact, you can edit data.txt on the server with your favorite text editor and whenever you save your changes all clients will be updated.

      Final example - JSON

      In this final example, the server will be sending down a recursive directory listing, complete with file names, sizes, and last modified dates. On the client, Vue.js will be used to present the data. We start with a template:

        <tr v-for="file in filelist">
          <td>{{ file.name }}</td>
          <td>{{ file.size }}</td>
          <td>{{ file.mtime }}</td>
        </tr>

      And add a bit of code:

      var app = new Vue({el: 'tbody', data: {filelist: []}});
      fetch('filelist.json').then((response) => {
        response.json().then((json) => { app.filelist = json });
      });
      subscribe('json', (data) => { app.filelist = JSON.parse(data) });

      The first line associates some data (initially an empty array) with an HTML element (in this case tbody). The remaining code should look very familiar by now. Because of the way Vue.js works, all that is required to update the display is to update the data.

      The server side should also seem pretty familiar:

      app.get('/dir.json', (request, response) => {
        response.json(stats(dirname));
      });
      fs.watch(dirname, {recursive: true}, (event, filename) => {
        var data = JSON.stringify(stats(dirname));
        for (var connection of connections.json) connection.send(data);
      });

      Not shown is the code that extracts the information from the filesystem; the rest is the same basic pattern that has been used for each of these demos.
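
      For readers who want to fill in the missing piece themselves, here is one way a stats function could be written. This is a guess at the shape of the data, based only on the name, size and mtime fields used in the template; it is a hypothetical stand-in and not the code from the demo.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical stand-in for the stats() function that is not shown:
// returns a flat list of the files below `dirname`, sorted by name.
function stats(dirname, prefix = '') {
  let result = [];
  for (const entry of fs.readdirSync(dirname, {withFileTypes: true})) {
    const full = path.join(dirname, entry.name);
    const name = prefix + entry.name;
    if (entry.isDirectory()) {
      // descend, keeping the relative path in the reported name
      result = result.concat(stats(full, name + '/'));
    } else if (entry.isFile()) {
      const stat = fs.statSync(full);
      result.push({name: name, size: stat.size, mtime: stat.mtime.toISOString()});
    }
  }
  return result.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}
```

      The synchronous calls keep the sketch short; for a large directory tree the asynchronous equivalents would be kinder to the server's event loop.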

      If you visit http://localhost:8080/filelist, you will see a table showing each of the files on the server. This list will be updated whenever you create, delete, or update any file. The server will push a new (and complete) set of data, and Vue.js will determine what needs to be changed in the browser window. All this generally takes place in a fraction of a second.

      Vue.js is only one such framework that can be used in this way. Angular, Ember.js, and React are additional frameworks that are worth exploring.


      By focusing on file system modified events, these demos have tried to demonstrate server initiated updates.

      With comparatively little code, web sites can be prepared to receive and apply unsolicited updates from the server. The granularity of the updates can be as little as a single string, can be an HTML fragment, or can be arbitrary data encoded in JSON.

      Reserving web sockets for server initiated broadcast operations can keep your code small and understandable. Traditional HTTP GET and POST requests can be used for all client initiated retrieval and update operations.

      This makes the division of labor between the client and server straightforward: the server is responsible for providing state -- both on demand and as the state changes. The client is responsible for updating the view to match the state.

      0 0

      I wish to share the anguish that the following Xerces bug has caused me:

      The bug reporter is quite right in his arguments. But I have to say that the Xerces team cannot fix this bug right now. I've also been thinking of resolving this bug with the reason "later" (but my conscience doesn't allow that either).

      I hope the situation will improve.

      0 0

      This blog post of mine was initially published by Computerworld UK in 2010.

      I’m amazed at how many so-called “enterprise software systems” do not embrace the Web model in 2010, making them much harder and much less fun to use than they should be.

      I have recently started making parallels between this and music teachers, and the analogy seems to work. Don’t ask where the parallel comes from…weird connections in my brain I guess.

      Say you want to learn to play the guitar. Someone recommended Joe, who’s teaching in his downtown studio.

      You get there almost on time. Traffic. You find Joe’s studio and here he is, dressed in a neat and simple casual outfit. Smiling at you.

      Joe: Hey welcome! So you wanna learn to play?

      You: Yes. I brought my guitar, got it from my uncle. It’s a bit worn out as you can see.

      Joe: I see…well, you might want to get a better one if you continue past the first few lessons, but for now that will do! Do you have something that you would like to play to get started?

      You: “Smoke on the water”, of course. The opening line.

      Joe: Let’s try that then, I’ll show you! Just plug your guitar into this amplifier, and let me set up some nice effects so you get a cool sound.

      Joe plays the first few bars a few times, shows you how that works and you give it a try. Ten minutes later you start sounding half-decent and you’re having loads of fun playing together with Joe.

      Joe: Okay, you’re doing good! I’ll show you my rough course plan so you know what’s up next. I’m quite flexible when it comes to the curriculum – as long as you’re having fun and progressing we’ll be fine.

      It’s easy to imagine the bad teacher version of this story:

      • Unwelcoming
      • Complains because you’re three minutes late.
      • Wears a boring old-fashioned suit, and not willing to let you play that crappy old guitar.
      • Boring you with tons of scales before you can start playing a song.
      • Not giving you an overview of what comes next.
      • Not ready to compromise on His Mighty Standard Teaching Program.
      • Making you feel stupid about how bad a player you are.

      Bad software is like that bad teacher:

      • Hard to get started with.
      • Requires tons of specific client software of just the right version.
      • Requires you to enter loads of useless information before doing anything useful or fun.
      • Not willing to let you explore and make your own mistakes, and making sure you feel stupid when mistakes occur.

      The Web model is the way to go, of course.

      • Ubiquitous access.
      • Welcoming to various types of client software.
      • Easy to point to by way of permanent URLs.
      • Doing its best (fail whales anyone?) to keep you informed and avoid making you feel stupid when something goes wrong.
      • Letting you explore its universe with simple web-based navigation, and rewarding your efforts with new discoveries.

      This is 2010, and this is the Web. Don’t let any useless software stand between you and the information and services that you need.

      0 0

      This blog post of mine was initially published by Computerworld UK in 2010.

      As open source comes of age and becomes mainstream, more and more job postings include “open source skills” in their requirements.

      But do you really want to hire someone who spends their time exchanging flames with members of their own community in public forums? Someone who greets newcomers with “I have forwarded your question to /dev/null, thanks” and other RTFM answers?

      Luckily, open source communities are not just about being rude and unwelcoming to strangers. Most of them are not like that at all, and the skills you learn in an open source community can make a big difference in a corporate environment as well.

      One very important skill that you learn or improve in an open source community is to express yourself clearly in written form. The mailing lists or forums that we use are very limited compared to in-person communications, and extra care is required to get your message through. Being concise and complete, disagreeing respectfully, avoiding personal attacks and coping with what you perceive as personal attacks are all extremely useful skills on the job. Useful skills for your whole life actually.

      Once you master asynchronous written discussions as a way to build group consensus, doing the same in a face to face meeting can be much easier. But the basic skills are the same, so what you learn in an open source community definitely helps.

      Travel improves the mind, and although being active in open source can help one travel more, even without traveling you’ll be exposed to people from different cultures, different opinions, people who communicate in their second or third language, and that helps “improve your mind” by making you more tolerant and understanding of people who think differently.

      Not to mention people who perceive what you say in a different way than you expected – this happens all the time in our communities, due in part to the weak communications channels that we have to use. So you learn to be extra careful with jokes and sneaky comments, which might work when combined with the right body language, but can cause big misunderstandings on our mailing lists. Like when you travel to places with a different culture.

      Resilience to criticism and self-confidence are also things that you’ll often develop in an open source community. Even if not rude, criticism in public can hurt your ego at first. After a while you just get used to it, take care of fixing your actual mistakes if any, and start ignoring unwarranted negative comments. You learn to avoid feeding the troll, as we say. Once your work starts to produce useful results that are visible to the whole community, you don’t really care if someone thinks you’re not doing a good job.

      The technical benefits of working in open source communities are also extremely valuable. Being exposed to the work and way of thinking of many extremely bright developers, and quite a few geniuses, definitely helps you raise the bar on what you consider good software. I remember how my listening skills improved when I attended a full-time music school for one year in my youth: just listening to great teachers and fellow students play made me unconsciously raise the bar on what I consider good music.

      Open source communities, by exposing you to good and clever software, can have the same effect. And being exposed to people who are much better than you at certain things (which is bound to happen for anybody in an open source project) also helps make you more humble and realistic about your strengths and weaknesses. Like in soccer, the team is most efficient when all players are very clear about their own and other players’ strengths and weaknesses.

      You’ll know to whom you should pass or not pass the ball in a given situation.

      To summarise, actively participating in a balanced open source community will make you a better communicator, a more resilient and self-confident person, improve your technical skills and make you humbler and more realistic about your strengths and weaknesses.

      0 0

      It is no secret that, for a while at least, Apache OpenOffice had lost its groove.

      Partly it was due to external issues, mostly that the project and the committers were spending a lot of their time and energy battling and correcting the FUD surrounding the project. Nary a week would go by without the common refrain "OpenOffice is Dead. Kill it already!" and constant (clueless) rehashes of the history between OpenOffice and LibreOffice. With all that, it is easy to understand why morale within the AOO community would have been low, which would in turn affect development on the project itself.

      So more so than anything, what the project needed was a good ol' shot of adrenaline in the arm and some encouragement to keep the flame alive. Over the last few months this has succeeded beyond our dreams. After an admittedly way-too-long period, we finally released AOO 4.1.4. And we are actively working on not only a 4.1.5 release but also preparing plans for our 4.2.0 release.

      And it's there that you can help.

      Part of what AOO really wants to be is a simple, easy-to-use, streamlined office suite for the largest population of people possible. This includes supporting old and long deprecated OSs. For example, our goal is to continue to support Apple OSX 10.7 (Lion) with our 4.2.0 release. However, there is one platform where we are simply unsure what to do and how to handle it. And what makes it even more interesting is that it's our reference build system for AOO 4.1.x: CentOS5.

      Starting with AOO 4.2.0, we are defaulting to GIO instead of Gnome VFS. The problem is that CentOS5 doesn't support GIO, which means that if we continue with CentOS5 as our reference build platform for our community builds, then all Linux users who use and depend on those community builds will be "stuck" with Gnome VFS instead of GIO. If instead we start using CentOS6 as our community build server, we leave CentOS5 users in the lurch (NOTE: CentOS5 users would still be able to build AOO 4.2.0 on their own, it's just that the binaries that the AOO project supplies won't work). So we are looking at 3 options:

      1. We stick w/ CentOS5 as our ref build system for 4.2.0 but force Gnome VFS.
      2. We move to CentOS6, accept the default of GIO but understand that this moves CentOS5 as a non-supported OS for our community builds.
      3. Just as we offer Linux 32 and 64bit builds, starting w/ 4.2.0 we offer CentOS5 community builds (w/ Gnome VFS) IN ADDITION TO CentOS6 builds (w/ GIO). (i.e.: 32bit-Gnome VFS, 64bit-Gnome VFS, 32bit-GIO, 64bit-GIO).

      Which one makes the most sense? Join the conversation and the discussion on the AOO dev mailing list!

      0 0
    • 01/04/18--05:43: Steve Loughran: Speculation

    • Speculative execution has been Intel's strategy for keeping the x86 architecture alive since the P6/Pentium Pro part shipped in '95.

      I remember coding explicitly for the P6 in a project in 1997; HPLabs was working with HP's IC Division to build their first CMOS-camera IC, which was an interesting problem. Suddenly your IC design needs to worry about light, aligning the optical colour filter with the sensors, making sure it all worked.


      I ended up writing the code to capture the raw data at full frame rate, streaming to HDD, with an option to alternatively render it with/without the colour filtering (algorithms from another HPL team). Which means I get to nod knowingly when people complain about "raw" data. Yes, it's different for every device precisely because it's raw.

      The data rates of the VGA-resolution sensor via the PCI boards used to pull this off meant that both cores of a multiprocessor P6 box were needed. It was the first time I'd ever had a dual socket system, but both sockets were full with the 150MHz parts and with careful work we could get away with the "full link rate" data capture which was a core part of the qualification process. It's not enough to self-test the chips any more, see: you need to look at the pictures.

      Without too many crises, everything came together, which is why I have a framed but slightly skewed IC part to hand. And it's why I have memories of writing multithreaded Windows C++ code with some of the core logic in x86 assembler. I also have memories of ripping out that ASM code as it turned out that it was broken, doing it as C pointer code and having it be just as fast. That's because: C code compiled to x86 by a good compiler, executed on a great CPU, is at least as performant as hand-written x86 code by someone who isn't any good at assembler, and can be made to be correct more easily by the selfsame developer.

      150 MHz may be a number people laugh at today, but the CPU:RAM clock ratios weren't as bad as they are today: cache misses were less expensive in terms of pipeline stalls, and those parts were fast. Why? Speculative and out of order execution, amongst other things:

      1. The P6 could provisionally guess which way a branch was going to go, speculatively executing that path until it became clear whether or not the guess was correct -and then commit/abort that speculative code path.
      2. It uses a branch predictor to make that guess on the direction a branch was taken, based on the history of previous attempts, and a default option (FWIW, this is why I tend to place the most likely outcome first in my if() statements; tradition and superstition).
      3. It could execute operations out of order. That is, its predecessor, the P5, was the last time mainstream Intel desktop/server parts executed x86 code in the order the compiler generated them, or the human wrote them.
      4. Register renaming meant that even though the parts had a limited set of registers, those OOO operations could reuse the same EAX, EBX, ECX registers without problems.
      5. It had caching to deal with the speed mismatch between that 150 MHz CPU & RAM.
      6. It supported dual CPU desktops, and I believe quad-CPU servers too. They'd be called "dual core" and "quad core" these days and looked down at.

      Being the first multicore system I'd ever used, it was a learning experience. First was learning how too much Windows NT4 code was still not stable in such a world. NTFS crashes with all volumes corrupted? check. GDI rendering triggering kernel crash? check. And on a 4-core system I got hold of, everything crashed more often. Lesson: if you want a thread safe OS, give your kernel developers as many cores as you can.

      OOO forced me to learn about the x86 memory model itself: barrier opcodes, when things could get reordered and when they wouldn't. Summary: don't try and be clever about synchronization, as your assumptions are invalid.

      Speculation is always an unsatisfactory solution though. Every mis-speculation is lost cycles. And on a phone or laptop, that's wasted energy as much as time. And failed reads could fill up the cache with things you didn't want. I've tried to remember if I ever tried to use speculation to preload stuff if present, but doubt it. The CMOV command was a non-branching conditional assignment which was better, even if you had to hand code it. The PIII/SSE added the PREFETCH opcode so you could issue a non-faulting hinted prefetch which you could stick into your non-branching code, but that was a niche opcode for people writing games/media codecs &c. And as Linus points out, what was clever for one CPU model turns out to be a stupid idea a generation later. (Arguably, that applies to Itanium/IA-64, though as it didn't speculate, it doesn't suffer from the Spectre & Meltdown attacks.)

      Speculation, then: a wonderful use of transistors to compensate for how we developers write so many if() statements in our code. Wonderful, because it kept the x86 line alive and so helped Intel deliver shareholder value and keep the RISC CPU out of the desktop, workstation and server businesses. Terrible, because "transistors" is another word for "CPU die area" with its yield equations and opportunity cost, and also for "wasted energy on failed speculations". If we wrote code which had fewer branches in it, and that got compiled down to CMOV opcodes, life would be better. But we have so many layers of indirection these days; so many indirect references to resolve before those memory accesses. Things are probably getting worse now, not better.
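
      As a rough illustration of what "fewer branches" can look like at the source level, here is a branch-free minimum of two 32-bit integers in JavaScript. It is only a sketch: branchlessMin is a made-up name, and whether a JIT or compiler turns either this or a plain if() into a CMOV is entirely up to it.

```javascript
// Branch-free minimum of two 32-bit signed integers.
// (a < b) is coerced to 1 or 0; negating it gives an all-ones or all-zeros
// mask once the bitwise operators convert it to a 32-bit integer.
function branchlessMin(a, b) {
  const mask = -(a < b);          // -1 when a < b, otherwise 0
  return b ^ ((a ^ b) & mask);    // a when the mask is all ones, else b
}
```

      The selection is done with arithmetic on the comparison result, so the source contains no if() on the data path for a branch predictor to guess at. It only works for values that fit in 32 bits, because that is the width JavaScript's bitwise operators work on.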

      This week's speculation-side-channel attacks are fascinating then. These are very much architectural issues about speculation and branch prediction in general, rather than implementation details. Any CPU manufacturer whose parts do speculative execution has to be worried here, even if there's no evidence that their shipping parts are vulnerable to the current set of attacks. The whole point about speculation is to speed up operation based on the state of data held in registers or memory, so the time-to-execute is always going to be a side-channel providing information about the data used to make a branch.

      The fact that you can get at kernel memory, even from code running under a hypervisor, means, well, a lot. It means that VMs running in cloud infrastructure could get at the data of the host OS and/or those of other VMs running on the same host (those S3 SSE-C keys you passed up to your VM? 0wned, along with your current set of IAM role credentials). It potentially means that someone else's code could be playing games with branch prediction to determine what codepaths your code is taking. Which, in public cloud infrastructure is pretty serious, as the only way to stop people running their code alongside yours is currently to pay for the top of the line VMs and hope they get a dedicated part. I'm not even sure that dedicated cores in a multicore CPU are sufficient isolation, not for anything related to cache-side-channel attacks (they should be good for branch prediction, I think, if the attacker can't manipulate the branch predictor of the other cores).

      I can imagine the emails between cloud providers and CPU vendors being fairly strained, with the OEM/ODM teams on the CC: list. Even if the patches being rolled out mitigate things, if the slowdown on switching to kernelspace is as expensive as hinted, then that slows down applications, which means that running the same job in-cloud just got more expensive. Big cloud customers will be talking to their infrastructure suppliers on this, and then negotiating discounts for the extra CPU hours, which is a discount the cloud providers will expect to recover when they next buy servers. I feel as sorry for the cloud CPU account teams as I do for the x86 architecture group.

      Meanwhile, there's an interesting set of interview questions you could ask developers on this topic.
      1. What does the generated java assembly for the Ival++ on a java long look like?
      2. What if the long is marked as volatile?
      3. What does the generated x86 assembler for a Java Optional<AtomicLong> look like?
      4. What guarantees do you get about reordering?
      5. How would you write code which attempted to defend against speculation timing attacks?

      I don't have the confidence to answer 1-4 myself, but I could at least go into detail about what I believed to be the case for 1-3; for #4 I should do some revision.

      As for #5, defending: I would love to see what others suggest. Conditional CMOV ops could help against branch-prediction attacks, by eliminating the branches. However, searching for references to CMOV and the JDK turns up some issues which imply that branch prediction can sometimes be faster, including "JDK-8039104. Don't use Math.min/max intrinsic on x86". It may be that even CMOV gets speculated on, with the CPU prefetching what is moved and keeping the write uncommitted until the state of the condition is known.

      I suspect that the next edition of Hennessy and Patterson, "Computer Architecture, a Quantitative Approach" will be covering this topic. I shall look forward to it with even greater anticipation than I have had for all the previous, beloved, versions.

      As for all those people out there panicking about this, worrying if their nearly-new laptop is utterly exposed? You are running with Flash enabled on a laptop you use in cafe wifis without a VPN and with the same password, "k1tten",  you use for gmail and paypal. You have other issues.

      0 0

      As the old year falls away and the new year boots up, it is traditional for people to write "old year retrospectives" as well as "new year predictions." Heck, I originally envisioned this entry as a duet of 2 separate blogs. But as much as I tried, it was just too difficult to keep them distinct and self-contained. There was simply too much overlap and as much as I expect "new things" in 2018, I think 2018 will mostly be a solidification of events and issues ramped up from 2017.

      So with all that in mind, I present my 2017-2018 Introspections... in no particular order:

      Continue reading "My 2017-2018 Introspections"

      0 0

      Peter Frankopan's The Silk Roads: A New History of the World is an extremely ambitious book.

      It sets out to survey, in a single 500 page volume, some 2000+ years of history of the region which, roughly speaking, spans from Turkey and Egypt to Mongolia and Pakistan in the one direction, and from Yemen to Russia in the other.

      That's a lot of land, and a lot of time, to cover.

      Certainly if you, like me, struggle to distinguish Basra from Bactria, Samarkand from Sanjan, Karakorum from Kashgar, Mosul from Mashad, Dushanbe from Dunhuang, or Istanbul from Isfahan (ok, well, that last one I knew), then you'll find a lot to learn in this history of human activity in Central Asia over the last few thousand years.

      And it's certainly a colorful book, full of great stories of traders, adventurers, explorers, merchants, prophets, and their interactions.

      (Attila the Hun! Genghis Khan! Richard Lionheart! The Black Death! Vasco da Gama! T.E. Lawrence! Timur! Marco Polo!)

      It's an immense scope, though, and Frankopan can barely get going on one episode before he races on to the next, breathless and impatient, rather like the White Rabbit: always in a hurry, but not quite sure where he's going.

      I didn't mind any of the minutes I spent with The Silk Roads, but in the end I'm afraid that this part of the world is still rather a blur to me, which is a shame, because I think that's precisely the problem that Frankopan set out to solve.

      Would he have been more successful (with me, at least), had he confined himself to a smaller region, or a shorter time period, the better to have used those pages to spend more time inhabiting particular incidents and characters? I'm not sure. I'm not much of a reader of histories, so I suspect this problem is just endemic to the genre, and it really just means that while his book was fascinating, I'm not really the target audience.

      0 0

      I think there are various people using XML who like having XML data without any validation. I'm a strong proponent of nearly always validating when using XML. Comparing the situation with RDBMS data makes this clear, I think (I don't mind arguing about one technology by taking cues from another which is hugely popular). Do we ever use data in RDBMS tables without the schema? We don't. The same should apply to XML, since validation is defined very closely alongside XML (DTD at least, and then XSD). If the XML toolkit of choice provides DTD or XSD validation along with XML parsing, then why shouldn't we validate whenever we use XML, and end up working with a better design as a consequence?
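
      As a small illustration of how little code validation needs, here is a sketch using the javax.xml.validation API that ships with the JDK. The schema, the element names and the XsdCheck class are made up for this example; a real application would load its own XSD from a file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical example: the schema and element names are invented.
class XsdCheck {
    // An <order> must contain exactly one <quantity> holding a positive integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns true when the document parses and conforms to the schema.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;   // a schema violation or malformed XML ends up here
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><quantity>3</quantity></order>"));
        System.out.println(isValid("<order><quantity>-1</quantity></order>"));
    }
}
```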

      Interestingly, validation doesn't always happen when using XML, because the XML language doesn't make it mandatory (unlike schemas in an RDBMS). People using XML sometimes want XML data quickly transported between components or stored locally, and they don't use validation in the process, which is fine when it meets the needs of the application.

      Sometimes, people using XML are influenced by how JSON is used. Presently, JSON doesn't have a schema language (though I've learned that this may change in the future), and JSON is very popular & useful for certain use cases. Therefore, people try to use XML the same way, i.e. without validation.

      0 0

      Meltdown has made for an "interesting" week in computing, as everyone is learning about/revising their knowledge of Speculative Execution. FWIW, I'd recommend the latest version of Patterson and Hennessy, Computer Architecture: A Quantitative Approach. Not just for its details on speculative execution, but because it is the best book on microprocessor architecture and design that anyone has ever written, and lovely to read. I could read it repeatedly and not get bored. (And I see I need to get the 6th edition!)

      Stokes Croft drugs find

      This weekend, rather than read Patterson and Hennessy(*) I had a go to see if you could implement the Meltdown attack in Java, hence in MapReduce, Spark, or any other non-native JAR.

      My initial attempt failed, at least provided the CPU only speculates one branch deep.

      More specifically "the range checking Java does on all array accesses blocks the standard exploit given steve's assumptions". You can speculatively execute the out of bounds query, but you can't read the second array at an offset which will trigger $L1 cache loading.

      If there's a way to do a reference to two separate memory locations which doesn't trigger branching range checks, then you stand a chance of pulling it off. I tried that using the ? : operator pair, something like

      String ref = data ? refA : refB;

      which I hoped might compile down to something like

      mov ref, refB
      cmp data, 0
      cmovnz ref, refA

      This would do the move of the reference in the ongoing speculative branch, so, if "ref" was referenced in any way, trigger the resolution.

      In my experiment (2009 MacBook Pro with OSX Yosemite + latest Java 8 early access release), a branch was generated ... but there are some refs in the OpenJDK JIRA to using CMOV, including the fact that the hotspot compiler may generate it if it thinks the probability of the move taking place is high enough.

      Accordingly, I can't say "the hotspot compiler doesn't generate exploitable codepaths", only "in this experiment, the hotspot compiler didn't appear to generate an exploitable codepath".
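
      For anyone wanting to repeat the check, the shape of such a probe is roughly as follows. This is a reconstruction for illustration, not the code from the experiment: the class and method names are invented. The idea is simply to call the ternary often enough, with a condition that varies, for HotSpot to compile it, and then to inspect the emitted code by running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (which needs the hsdis disassembler library on the JVM's library path).

```java
// Reconstruction for illustration; not the code from the experiment described above.
class SelectProbe {
    // The ternary under test: the JIT may compile this to a branch or to a CMOV.
    static String select(boolean data, String refA, String refB) {
        return data ? refA : refB;
    }

    // Calls select() repeatedly with an alternating condition so that
    // HotSpot sees both outcomes and eventually compiles the method.
    static int warmUp(int iterations) {
        int total = 0;
        for (int i = 0; i < iterations; i++) {
            total += select((i & 1) == 0, "A", "BB").length();
        }
        return total;
    }

    public static void main(String[] args) {
        // The result is printed so the loop cannot be discarded as dead code.
        System.out.println(warmUp(1_000_000));
    }
}
```

      What the disassembly shows will depend on the JVM version, the CPU and the branch profile collected during warm-up, so a single run is evidence about one configuration only.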

      Now the code is done, I might try on a Linux VM with Java 9 to see what is emitted.

      1. If you can get the exploit in, then you'd have access to other bits of the memory space of the same JVM, irrespective of what the OS does. That means one thread with a set of Kerberos tickets could perhaps grab the secrets of another. It'd be pretty hard, given the way the JVM manages objects on the heap: I wouldn't know where to begin, but it would become hypothetically possible.
      2. If you can get native code which you don't trust loaded into the JVM, then it can do whatever it wants. The original Meltdown exploit is there. But native code running in the JVM is going to have unrestricted access to the entire address space of the JVM: you don't need to use Meltdown to grab secrets from the heap. All Meltdown would do here is offer the possibility of grabbing kernel space data, which is what the OS patch blocks.

      Anyway, I believe my first attempts failed within the context of this experiment.

      Code-wise, this kept me busy on Sunday afternoon. I managed to twist my ankle quite badly on a broken paving stone on the way to the patisserie on Saturday, so sat around for an hour drinking coffee in Stokes Croft, then limped home, with all forms of exercise crossed off the TODO list for the w/e. Time for a bit of Java coding instead, as a break from what I'd been doing over the holiday (C coding a version of Ping which outputs CSV data, and a LaTeX paper on the S3A committers).

      It took as much time trying to get hold of the OS/X disassembler for generated code as it did coding the exploit. Why so? Oracle have replaced all links which would point to the reference dynamic library with a 302 to the base Java page telling you how lucky you are that Java is embedded in cars. Or you see a ref to on-stack-replacement on a page in Project Kenai, under a URL which starts with, point your browser there and end up on and the message "We're sorry the site has closed."

      All the history and knowledge on JVM internals and how to work there is gone. You can find the blog posts from four years ago on the topic, but the links to the tools are dead.

      This is truly awful. It's the best argument I've seen for publishing this info as PDF files with DOI references, where you can move the artifact around, but citeseer will always find it. If the information doesn't last five years, then

      The irony is, it means that because Oracle have killed all those inbound links to Java tools, they're telling the kind of developer who wants to know these things to go away. That's strategically short-sighted. I can understand why you'd want to keep the cost of site maintenance down, but really, breaking every single link? It's a major loss to the Java platform —especially as I couldn't even find a replacement.

      I did manage to find a copy of the openjdk tarball people said you could D/L and run make on, but it was on a freebsd site, and even after a ./Configure && make, it broke trying to create a bsd dynlib. Then I checked out the full openjdk source tree, branch -8, installed the various tools and tried to build there. Again, some error. I ended up finding a copy of the needed hsdis-amd64.dylib library on Github, but I then had to spend some time looking at evolvedmicrobe's work &c to see if I could trust this to "probably" not be malware itself. I've replicated the JAR in the speculate module, BTW.

      Anyway, once the disassembler was done and the other aspects of hotspot JIT compilation clear (if you can't see the method you wrote, run the loop a few thousand more times), I got to see some well annotated x86-64 assembler. Leaving me with a new problem: x86-64 assembler. It's a lot cleaner than classic 32 bit x86: having more registers does that, especially as it gives lots of scope for improving how function parameters and return values are managed.

      What next? This is only a spare time bit of work, and now I'm back from my EU-length xmas break, I'm doing other things. Maybe next weekend I'll do some more. At least now I know that exploiting Meltdown from the JVM is not going to be straightforward.

      Also I found it quite interesting playing with this, to see when the JVM kicks out native code, and what it looks like. We code so far from the native hardware these days; it's too "low level". But the new speculation-side-channel attacks have shown that you'd better understand modern CPU architectures, including how your high-level code gets compiled down.

      I think I should submit a berlin buzzwords talk on this topic.

      (*) It is traditional to swap the names of the author on every use. If you are a purist you have to remember the last order you used.

      0 0

      Set up Ruby on Rails with Paperclip and S3 using AWS SDK
      Uploading Files to S3 in Ruby with Paperclip

      Paperclip requires the following gems added to your Gemfile.

      If your Paperclip version is 5.1.0, use 'aws-sdk' version 2.3.

      # Gemfile
      gem 'paperclip'
      gem 'aws-sdk', '~> 2.3'

      If your Paperclip version is 4.1.0, you need to use 'aws-sdk' version < 2 (note: pin a version less than 2.0, otherwise you will get a Paperclip error).

      gem 'paperclip'
      gem 'aws-sdk', '< 2.0'

      Run bundle install and restart the Rails server after modifying the Gemfile.

      Then run the command :-

      rails generate paperclip user image

      Define the file attribute in the Model

      class User < ActiveRecord::Base
        has_attached_file :image, styles: { medium: "300x300>", thumb: "100x100>" }, default_url: "/images/:style/missing.png"
        validates_attachment_content_type :image, content_type: /\Aimage\/.*\z/
      end

      Migrations: Now our migration file looks like :-

      class AddAvatarColumnsToUsers < ActiveRecord::Migration
        def up
          add_attachment :users, :image
        end

        def down
          remove_attachment :users, :image
        end
      end

      View Page :-

      <%= form_for @user, url: users_path, html: { multipart: true } do |form| %>
        <%= form.file_field :image %>
      <% end %>

      Now in the controller :-

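      def create
        @user = User.create( user_params )
      end

      # Strong parameters: permit only the attachment attribute defined on the model
      def user_params
        params.require(:user).permit(:image)
      end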

      In our view page we can show the image using :-

      <%= image_tag @user.image.url %>
      <%= image_tag @user.image.url(:medium) %>
      <%= image_tag @user.image.url(:thumb) %>

      After that, the S3 bucket implementation:-
      1. Go to the AWS Console.
      2. Create the S3 bucket: give it a bucket name and choose a specific region.
      3. Get the app_key and app_secret (link:

      We’ll also need to specify the AWS configuration variables for the development/production Environment.

      # config/environments/production.rb
      config.paperclip_defaults = {
        storage: :s3,
        s3_credentials: {
          bucket: ENV.fetch('S3_BUCKET_NAME'),
          access_key_id: ENV.fetch('AWS_ACCESS_KEY_ID'),
          secret_access_key: ENV.fetch('AWS_SECRET_ACCESS_KEY'),
          s3_region: ENV.fetch('AWS_REGION')
        }
      }


      Model :-

      class User < ActiveRecord::Base
        has_attached_file :image, :styles => { :icon => "50x50>", :small => "150x150", :medium => "300x300>", :thumb => "100x100>" }, :default_url => "/assets/icons/picture.png", :storage => :s3, :s3_credentials => "#{Rails.root}/config/aws_s3.yml", :url => ':s3_domain_url', :path => ":attachment/:id/:style/:filename"

        validates_attachment_content_type :image, :content_type => /\Aimage\/.*\Z/
      end

      # config/aws_s3.yml

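      # Plain YAML does not evaluate Ruby, so the lookups are wrapped in ERB tags
      bucket: <%= ENV.fetch('S3_BUCKET_NAME') %>
      access_key_id: <%= ENV.fetch('AWS_ACCESS_KEY_ID') %>
      secret_access_key: <%= ENV.fetch('AWS_SECRET_ACCESS_KEY') %>
      s3_region: <%= ENV.fetch('AWS_REGION') %>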

      WE CAN REFER :-

      0 0

      What impact on latency should you expect from applying the kernel patches for the Meltdown security vulnerability?

      TL;DR expect a latency increase of at least 20% for both reads and writes.


      The Meltdown vulnerability, formally CVE-2017-5754, allows rogue processes to access kernel memory. Simple demonstrations have already appeared online on how to expose passwords and ssh private keys from memory. The consequences of this, in particular on shared hosts (i.e. cloud), are considered “catastrophic” by security analysts. Initially discovered in early 2017, the vulnerability was planned to be publicly announced on the 9th January 2018. However, due to the attention generated by the frequency of Linux kernel ‘page-table isolation’ (KPTI) patches committed late in 2017, the news broke early on 3rd January 2018.


      Without updated hardware, the Linux kernel patches impact CPU usage. While userspace programs are not directly affected, anything that triggers a lot of interrupts to the CPU, such as a database’s use of IO and network, will suffer. Early reports are showing evidence of CPU usage taking a hit between 5% and 70%. Because of the potential CPU performance hit and lack of evidence available, The Last Pickle used a little time to see what impacts we could record for ourselves.


      The hardware used for testing was a Lenovo X1 Carbon (gen 5) laptop. This machine runs an Intel Core i7-5600U CPU with 8GB RAM. Running on it is Ubuntu 17.10 Artful. The unpatched kernel was version 4.13.0-21, and the patched kernel version 4.13.0-25. A physical machine was used to avoid the performance variances encountered in the different cloud environments.

      The Ubuntu kernel was patched according to instructions here and the ppa:canonical-kernel-team/pti repository.


      A simple schema, but typical of many Cassandra usages, was used on top of Cassandra-3.11.1 via a 3 node ccm cluster. The stress execution ran with 32 threads. Running stress, three nodes, and a large number of threads on one piece of hardware was intentional, so as to increase thread/context switching and kernel overhead.

      The stress run was limited to 5k requests per second so as to avoid saturation, which occurred around 7k/s. The ratio of writes to reads was 1:1, with reads being split between whole partitions and single rows. The table used TWCS and was tuned down to 10 minute windows, so as to ensure compactions ran during an otherwise short stress run. The stress ran for an hour against both the unpatched and patched kernels.

      ccm stress user profile=stress.yaml ops\(insert=2,by_partition=1,by_row=1\) duration=1h -rate threads=32 throttle=5000/s -graph file=meltdown.html title=Meltdown revision=cassandra-3.11.1-unpatched


      The following graphs show that over every percentile a 20%+ latency increase occurs. Sometimes the increase is up around 50%.
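
      For readers who want to run the same comparison on their own numbers, the arithmetic behind a statement like "20%+ at every percentile" can be sketched as below. The sample latencies are invented purely for illustration and are not taken from the stress results; the percentile uses the simple nearest-rank definition, which may differ slightly from what the stress tool reports.

```java
import java.util.Arrays;

// Hypothetical helper: the sample values in main() are made up, not measured.
class LatencyCompare {
    // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Relative change from before to after, expressed as a percentage of before.
    static double increasePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double[] unpatchedMs = {1.0, 1.2, 1.1, 1.4, 2.0, 1.3, 1.1, 1.2, 1.6, 3.0};
        double[] patchedMs   = {1.3, 1.5, 1.4, 1.8, 2.6, 1.6, 1.4, 1.5, 2.1, 4.2};
        for (double p : new double[]{50, 95, 99}) {
            double before = percentile(unpatchedMs, p);
            double after = percentile(patchedMs, p);
            System.out.printf("p%.0f: %.2f ms -> %.2f ms (%.1f%%)%n",
                p, before, after, increasePercent(before, after));
        }
    }
}
```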

      Meltdown Cassandra median

      Meltdown Cassandra 95th

      Meltdown Cassandra 99th

      Meltdown Cassandra stats


      The full stress results are available here.

      0 0

      Berlin Buzzwords CFP is open, which, along with Dataworks Summit in April, is going to make Berlin the place for technical conferences in 2018.
      As with last year, I'm offering to review people's abstracts before they're submitted, and help edit them to get the text to be more in the style that reviewers tend to go for.

      When we review the talks, we look for interesting things in the themes of the conference, try and balance topics, pick the fun stuff. And we measure that (interesting, fun) on the prose of the submissions, knowing that they get turned into the program for the attendees: we want the text to be compelling for the audience.

      The target audiences for submissions then are twofold. The ultimate audience is the attendees. The reviewers? We're the filter in the way.

      But alongside that content, we want a diverse set of speakers, including people who have never spoken before. Otherwise it gets a bit repetitive (oh, no, stevel will talk on something random, again), and that's no good for the audience. But how do we regulars get in, given that the submission process is anonymous?

      We do it by writing abstracts which we know the reviewers are looking for.

      The review process, then, is a barrier to getting new speakers into the program, which is dangerous: we all miss out on the insights from other people. And for the possible speakers, they miss out on the fun you have being a speaker at a conf, trying to get your slides together, discovering an hour in advance that you only have 20 min and not 30 for your talk and picking 1/3 of the slides to hide. Or on a trip to say, Boston, having your laptop have a hardware fault and you being grateful you snapshotted it onto a USB stick before you set off. Those are the bad points. The good bits? People coming up to you afterwards and getting into discussion about how they worked on similar stuff but came up with a better answer, how you learn back from the audience about related issues, how you can spend time in Berlin in cafes and wandering round, enjoying the city in early summer, sitting outside at restaurants with other developers from around Europe and the rest of the world, sharing pizza and beer in the evening. Berlin is a fun place for conferences.

      Which is why people should submit a talk, even if they've never presented before. And to help them, feel free to stick a draft up on google docs & then share with edit rights to my gmail address, steve.loughran@ ;  send me a note and I'll look through.

      Yes, I'm one of the reviewers, but in my reviews I call out that I helped with the submission: fairness is everything.

      Last year only one person, Raam Rosh Hai, took this offer up. And he got in, with his talk How to build a recommendation system overnight! This means that so far, all drafts which have been through this pre-review process have a 100% success rate. And, if you look at the video, you'll see it's a good talk: he deserved that place.

      Anyway, Submission deadline: Feb 14. Conference June 10-12.  Happy to help with reviewing draft abstracts.

      0 0

      Paperclip S3 AccessDenied

      Aws::S3::Errors::AccessDenied (Access Denied):

      Assuming you are using AWS’s IAM and you created a dedicated User for these uploads.

      If you get this error when trying to upload to S3, you need to assign this IAM User the “AmazonS3FullAccess” Policy.

      0 0


      A short lesson, this time; perhaps our education is nearing completion and we are moving on to become graduate students?

      Clearlake Capital Acquires Perforce Software

      Clearlake Capital Group, L.P. (together with its affiliates, “Clearlake”) today announced that it has acquired Perforce Software (“Perforce” or the “Company”), developer of the industry’s most flexible, scalable and secure version control and collaboration platform, from growth equity investor Summit Partners. The Company will continue to be led by Janet Dryer, CEO, and Mark Ties, COO, who will both join the Board of Directors alongside Clearlake. Financial terms were not disclosed.

      0 0

      • google/highwayhash: Fast strong hash functions: SipHash/HighwayHash

        HighwayHash: ‘We have devised a new way of mixing inputs with AVX2 multiply and permute instructions. The multiplications are 32×32 -> 64 bits and therefore infeasible to reverse. Permuting equalizes the distribution of the resulting bytes. The internal state occupies four 256-bit AVX2 registers. Due to limitations of the instruction set, the registers are partitioned into two 512-bit halves that remain independent until the reduce phase. The algorithm outputs 64 bit digests or up to 256 bits at no extra cost. In addition to high throughput, the algorithm is designed for low finalization cost. The result is more than twice as fast as SipTreeHash. We also provide an SSE4.1 version (80% as fast for large inputs and 95% as fast for short inputs), an implementation for VSX on POWER and a portable version (10% as fast). A third-party ARM implementation is referenced below. Statistical analyses and preliminary cryptanalysis are given in’ (via Tony Finch)

        (tags: siphash highwayhash via:fanf hashing hashes algorithms mac google hash)

      • Brain Cells Share Information With Virus-Like Capsules – The Atlantic

        …a gene called Arc which is active in neurons, and plays a vital role in the brain. A mouse that’s born without Arc can’t learn or form new long-term memories. If it finds some cheese in a maze, it will have completely forgotten the right route the next day. “They can’t seem to respond or adapt to changes in their environment,” says Shepherd, who works at the University of Utah, and has been studying Arc for years. “Arc is really key to transducing the information from those experiences into changes in the brain.” Despite its importance, Arc has been a very difficult gene to study. Scientists often work out what unusual genes do by comparing them to familiar ones with similar features—but Arc is one-of-a-kind. Other mammals have their own versions of Arc, as do birds, reptiles, and amphibians. But in each animal, Arc seems utterly unique—there’s no other gene quite like it. And Shepherd learned why when his team isolated the proteins that are made by Arc, and looked at them under a powerful microscope. He saw that these Arc proteins assemble into hollow, spherical shells that look uncannily like viruses. “When we looked at them, we thought: What are these things?” says Shepherd. They reminded him of textbook pictures of HIV, and when he showed the images to HIV experts, they confirmed his suspicions. That, to put it bluntly, was a huge surprise. “Here was a brain gene that makes something that looks like a virus,” Shepherd says. That’s not a coincidence. The team showed that Arc descends from an ancient group of genes called gypsy retrotransposons, which exist in the genomes of various animals, but can behave like their own independent entities.* They can make new copies of themselves, and paste those duplicates elsewhere in their host genomes. At some point, some of these genes gained the ability to enclose themselves in a shell of proteins and leave their host cells entirely. That was the origin of retroviruses—the virus family that includes HIV.

        (tags: brain evolution retroviruses viruses genes arc gag proteins memory biology)

      0 0

      Andy Weir had the debut novel sensation that, surely, every novelist dreams of: The Martian was a world-wide best-seller, stayed on the best-seller lists for almost two years, and was then adapted to become one of the top ten movies of 2015.

      You can only imagine what a life-changing experience this must have been for a guy who spent 15 years writing novels while working full time.

      Anyway, Weir is now back with his second novel: Artemis.

      In various creative fields, people talk about the "sophomore slump", and it surely can't have been easy for Weir to figure out how he wanted to write his next book. I'm sure he was also feeling pressure from both his readers and his publisher to hurry up and deliver another book.

      So he did.

      Artemis is certainly not the book that The Martian was.

      However, both as a standalone effort and as a companion piece, it is quite interesting.

      And, as you should probably grow to expect from Weir, it's a rollicking roller-coaster adventure ride of a book.

      But while The Martian was a book about humans who were in space, and wanted to get back to Earth, Artemis is a book about people who were on Earth, and have decided that they want to live in space.

      Weir is very interested in the notion of what it would mean for humans to be living somewhere other than on Earth, which is indeed a fascinating thing to think about, and Artemis is of most interest when you look at it from that viewpoint.

      Artemis, as it turns out, spends most of its time spinning tales of completely ordinary experiences that have much more to do with being human beings, than with being in outer space. Rather than being just a sterile laboratory occupied by scientists, as so many "outer space" books are, Weir's outer space civilization is full of everything that makes us human. There are bars, casinos, and night clubs; there are prostitutes, drug dealers, and smugglers; there are petty rivalries, dirty laundry, and double-dealing.

      But, most of all, there are complex systems, and, as was true with The Martian, it is when dealing with interesting complex systems that Weir's book is at its most interesting (even if great literature it ain't):

      He wiggled his hand. "That wasn't just you. There were a lot of engineering failures. Like: Why aren't there detectors in the air pipeline for complex toxins? Why did Sanchez store methane, oxygen, and chlorine in a room with an oven? Why doesn't Life Support have its own separate air partition to make sure they'll stay awake if the rest of the city has a problem? Why is Life Support centralized instead of having a separate zone for each bubble? These are the questions people are asking.

      Moreover, as Weir observes, these aren't actually engineering questions at their root; they are questions about how we organize our societies, a question which is just as important and relevant in outer space as it is here on Earth:

      "The next big step is taxes."

      "Taxes?" I snorted. "People come here because they don't want to pay taxes."

      "They already pay taxes -- as rent to KSC. We need to change over to a property-ownership and tax model so the city's wealth is directly tied to the economy. But that's not for a while."

      She took off her glasses. "It's all part of the life-cycle of an economy. First it's lawless capitalism until that starts to impede growth. Next comes regulation, law enforcement, and taxes. After that: public benefits and entitlements. Then, finally, overexpenditure and collapse."

      "Wait. Collapse?"

      "Yes, collapse. An economy is a living thing. It's born full of vitality and dies once it's rigid and worn out. Then, through necessity, people break into smaller economic groups and the cycle begins anew, but with more economies. Baby economies, like Artemis is right now."

      Although Artemis ultimately fails as a work of literature, it is promising as a hint of what Weir is interested in, and where he might go.

      Humans in space is a fascinating concept, and thinking about it realistically, rather than in some fantastic sterile implausible laboratory fashion, is how we're going to get to a point where we're actually ready to have humans in space. Building space ships and sending people out in them is just an engineering problem, and we'll solve that, probably pretty soon. But economics, politics, crime, government? These are actually HARD problems.

      Writing about them, thinking about them, sharing those ideas, is one way to make it real, and for that, if for nothing else, I enjoyed reading Artemis and will look forward to Weir's next work.

      0 0

      Gentoo has given more love to its sys-firmware/intel-microcode package. It's now easier than ever to update microcode automatically early on boot. This is a valuable alternative to shipping microcode within the BIOS. I mean honestly, who regularly checks for BIOS updates? It's much easier to do that through an ebuild.

      The new USE flag initramfs now builds a cpio archive at /boot/intel-uc.img ready to be used by grub. In /boot/grub/grub.cfg:

      menuentry 'Gentoo Linux 4.14' {
        linux /boot/linux-4.14.12 root=LABEL=ROOT ro rootfstype=ext4 net.ifnames=0
        initrd /boot/intel-uc.img /boot/initrd.img
      }

      Note how the microcode initramfs is simply prepended to the boot initramfs (initrd). A kernel that has microcode loading support enabled will find it there, upload the microcode into the CPU, then discard the initramfs blob and continue booting with the initrd.img. The first line in your dmesg output will show:

      microcode: microcode updated early to revision 0x80, date = 2018-01-04

      0 0

      One of the challenges of running large scale distributed systems is being able to pinpoint problems. It’s all too common to blame a random component (usually a database) whenever there’s a hiccup even when there’s no evidence to support the claim. We’ve already discussed the importance of monitoring tools, graphing and alerting metrics, and using distributed tracing systems like ZipKin to correctly identify the source of a problem in a complex system.

      Once you’ve narrowed down the problem to a single system, what do you do? To figure this out, it’s going to depend on the nature of the problem, of course. Some issues are temporary, like a dead disk. Some are related to a human-introduced change, like a deployment or a wrong configuration setting. These have relatively straightforward solutions. Replace the disk, or rollback the deployment.

      What about problems that are outside the scope of a simple change? One external factor that we haven’t mentioned so far is growth. Scale can be a difficult problem to understand because reproducing the issue is often nuanced and complex. These challenges are sometimes measured in throughput (requests per second), size (terabytes), or latency (5ms p99). For instance, if a database server is able to serve every request out of memory, it may get excellent throughput. As the size of the dataset increases, random lookups are more and more likely to go to disk, decreasing throughput. Time Window Compaction Strategy is a great example of a solution to a scale problem that’s hard to understand unless the numbers are there. The pain of compaction isn’t felt until you’re dealing with a large enough volume of data to cause performance problems.

      During the times of failure we all too often find ourselves thinking of the machine and its processes as a black box. Billions of instructions executing every second without the ability to peer inside and understand its mysteries.

      Fortunately, we’re not completely blind as to what a machine is doing. For years we’ve had tools like debuggers and profilers available to us. Oracle’s JDK offers us Java Flight Recorder, which we can use to analyze running processes locally or in production:

      [Screenshot: Java Mission Control]

      Profiling with Flight Recorder is straightforward, but interpreting the results takes a little bit of work. Expanding the list of nested tables and looking for obvious issues is a bit more mental work than I’m interested in; it would be a lot easier if we could visualize the information. Flight Recorder also requires a commercial license to use in production, and only works with the Oracle JDK.

      That brings us back to the subject of this post: a way of generating useful visual information called a flame graph. A flame graph lets us quickly identify the performance bottlenecks in a system. Flame graphs were invented by Brendan Gregg. This is also part one of a very long series of performance tuning posts; we will be referring back to it as we dive deeper into the internals of Cassandra.

      Swiss Java Knife

      The approach we’ll examine in this post is utilizing the Swiss Java Knife, usually referred to as SJK, to capture the data from the JVM and generate the flame graphs. SJK is a fantastic collection of tools. Aside from generating flame graphs, we can inspect garbage collection statistics, watch threads, and do a variety of other diagnostic tasks. It works on Mac, Linux, and both the Oracle JDK and the OpenJDK.

      I’ve downloaded the JAR, put it in my $HOME/bin and set up a shell function to call it easily:

      sjk () {
      	# Pass all arguments straight through to the SJK JAR.
      	java -jar ~/bin/sjk-plus-0.8.jar "$@"
      }

      On my laptop I’m running a workload with cassandra-stress. I’ve prepopulated the database, and started the stress workload with the following command:

      cassandra-stress read n=1000000

      For the first step of our analysis, we need to capture the stack frames of our running Java application using the stcap feature of SJK. To do this, we need to pass in the process id and the file to which we will dump the data. The dumps are written in a binary format that we’ll be able to query later:

      sjk stcap -p 92541 -i 10ms -o dump.std
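The process id in the command above was looked up by hand. A minimal sketch of automating that step follows. It assumes Cassandra was started the usual way, so that its command line contains the main class name CassandraDaemon, and that pgrep is installed. The helper names find_pid and capture_stacks and the default output file are illustrative, not part of SJK.

```shell
# Hypothetical helper: print the PID of the first process whose command line
# matches a pattern (default: Cassandra's main class, CassandraDaemon).
find_pid() {
    pgrep -f "${1:-CassandraDaemon}" | head -n 1
}

# Hypothetical helper: sample the matching JVM every 10ms into a dump file.
# Relies on the sjk shell function defined earlier in this post.
capture_stacks() {
    pid=$(find_pid "$1")
    if [ -z "$pid" ]; then
        echo "no matching process found" >&2
        return 1
    fi
    sjk stcap -p "$pid" -i 10ms -o "${2:-dump.std}"
}
```

Calling capture_stacks with no arguments looks for Cassandra and writes dump.std. A different pattern and file name can be passed as the first and second arguments.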

      Then we can analyze the data. If all we have is a terminal, we can print out a histogram of the analysis. This can be pretty useful on its own if there’s an obvious issue. In this case, we can see that a lot of time is spent in sun.misc.Unsafe.park, meaning threads are just waiting around, parked:

      $ sjk ssa -f dump.std --histo
      Trc     (%)  Frm  N  Term    (%)  Frame
      372447  96%  372447       0   0%
      309251  80%  309251  309251  80%  sun.misc.Unsafe.park(Native Method)
      259376  67%  259376       0   0%  java.util.concurrent.locks.LockSupport.park(
      254388  66%  254388       0   0%
       55709  14%  55709        0   0%  java.util.concurrent.ThreadPoolExecutor$
       52374  13%  52374        0   0%  org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$6/ Source)
       52374  13%  52374        0   0%  org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(
       44892  11%  44892        0   0%  io.netty.util.concurrent.DefaultThreadFactory$
       44887  11%  44887        0   0%  java.util.concurrent.ThreadPoolExecutor.runWorker(
       42398  11%  42398        0   0%
       42398  11%  42398        0   0%  io.netty.util.concurrent.SingleThreadEventExecutor$
       42398  11%  42398        0   0%
       42398  11%  42398        0   0%
       42398  11%  42398        0   0%
       42398  11%  42398    42398  11% Method)
       42398  11%  42398        0   0%
       42398  11%  42398        0   0%
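Most rows in the histogram have a terminal count of zero, so the interesting frames are easy to miss. Here is a rough sketch of a filter that keeps only the frames sitting on top of the stack in at least a given percentage of samples. The column positions are assumed from the output shown above, and hot_frames is a made-up name, not an SJK feature.

```shell
# Hypothetical helper: read `sjk ssa --histo` output on stdin and print the
# frames whose terminal percentage (5th column) is at least MIN (default 5).
# Column positions are an assumption based on the sample output above.
hot_frames() {
    awk -v min="${1:-5}" '
        # Only data rows start with a number; "80%" + 0 evaluates to 80.
        $1 ~ /^[0-9]+$/ && $5 + 0 >= min {
            frame = $6
            for (i = 7; i <= NF; i++) frame = frame " " $i
            print $5, frame
        }'
}
```

It can be used as: sjk ssa -f dump.std --histo | hot_frames 10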

      Now that we have our stcap dump, we can generate a flame graph with the following command:

      sjk ssa --flame -f dump.std > flame-sjk.svg

      When you open the SVG in a browser, you should end up with an image which looks something like this:


      If you open the flame graph on your machine you can mouse over the different sections to see the method call and percentage of time it’s taking. The wider the bar, the more frequently that method appears in the sampled stacks. It’s very easy to glance at the graph to understand where the time is spent in our program.

      This is not the only technique for generating flame graphs. Brendan Gregg has a long list of links and references I recommend reading at the bottom of his FlameGraph page. I intend to write a utility to export the SJK format to the format that Brendan uses on his blog, as it’s a little nicer to look at, has a better mouseover, supports drill down, and has a search. His tools also support differential flame graphs, which are nice if you’re doing performance comparisons across different builds.

      I hope you’ve enjoyed this post on visualizing Cassandra’s performance using FlameGraphs. We’ve used this tool several times with the teams we’ve worked with to tune Cassandra’s configurations and optimize performance. In the next post in this series we’ll be examining how to tune garbage collection parameters to maximize throughput while keeping latency to a minimum.

      0 0

      • The likely user interface which led to Hawaii’s false-alarm incoming-ballistic-missile alert on Saturday 2018-01-13

        @supersat on Twitter: “In case you’re curious what Hawaii’s EAS/WEA interface looks like, I believe it’s similar to this. Hypothesis: they test their EAS authorization codes at the beginning of each shift and selected the wrong option.” This is absolutely classic enterprisey, government-standard web UX — a dropdown template selection and an easily-misclicked pair of tickboxes to choose test or live mode.

        (tags: testing ux user-interfaces fail eas hawaii false-alarms alerts nuclear early-warning human-error)

      • The Death of Microservice Madness in 2018

        Quite a good set of potential gotchas, which I’ve run into myself, including: ‘Real world systems often have poorly defined boundaries’ ‘The complexities of state are often ignored’ ‘The complexities of communication are often ignored’ ‘Versioning can be hard’ ‘Microservices can be monoliths in disguise’

        (tags: architecture devops microservices services soa coding monoliths state systems)

      • Do algorithms reveal sexual orientation or just expose our stereotypes?

        ‘A study claiming that artificial intelligence can infer sexual orientation from facial images caused a media uproar in the Fall of 2017. […] Michal Kosinski, who co-authored the study with fellow researcher Yilun Wang, initially expressed surprise, calling the critiques “knee-jerk” reactions. However, he then proceeded to make even bolder claims: that such AI algorithms will soon be able to measure the intelligence, political orientation, and criminal inclinations of people from their facial images alone.’ ‘In [this paper], we have shown how the obvious differences between lesbian or gay and straight faces in selfies relate to grooming, presentation, and lifestyle, that is, differences in culture, not in facial structure. […] We’ve demonstrated that just a handful of yes/no questions about these variables can do nearly as good a job at guessing orientation as supposedly sophisticated facial recognition AI. Therefore, at least at this point, it’s hard to credit the notion that this AI is in some way superhuman at “outing” us based on subtle but unalterable details of our facial structure.’

        (tags: culture facial-recognition ai papers facial-structure sexual-orientation lgbt computer-vision)

      • Shanzhai 山寨 China & its Contents

        As he drinks Sino-coffee for around RMB 10, Comrade X might well be wearing the latest ‘ZARE’ couture while watching the TV news streaming on his HiPhone.[2] Back in Guangdong, his girlfriend — a sales consultant at a small stall in one of Shenzhen’s many wholesale electronics markets — sports a ‘high-end replica’ Louis Vuitton bag and makes a living selling ‘domestically produced’ and ‘smuggled’ smartphones. The imitation products that festoon the couple’s lives are part of ‘shanzhai 山寨 China’. Shanzhai, the word means roughly ‘mass-produced imitation goods’, has created a Chinese landscape that is littered with products derided by the media, Chinese and international, as ‘copycat’, ‘guerrilla counterfeits’ and ‘knockoffs’, all the work of thieves.[3] Those who feel that their intellectual property and copyright has been infringed by shanzhai producers describe the products as ‘rubbish’, ‘piracy in disguise’ and ‘hooligan’.[4] Regardless of such righteous outrage, shanzhai — the producers, the products and the mentality — continues to flourish as an essential, quasi-legitimate shadow dimension of the Chinese economy. And, in practical terms, shanzhai products give disenfranchised ‘non-consumers’ of the orthodox economy — that is, people who would like to own but can’t afford the ‘original’ products — cut-price access to high-end technologies, as well as offering aspirational shoppers consumer satisfaction.

        (tags: shanzai china fakes consumerism hiphone smartphones copycat knockoffs imitation consumption)

      • Don Norman on “Human Error”, RISKS Digest Volume 23 Issue 07 2003

        It is far too easy to blame people when systems fail. The result is that over 75% of all accidents are blamed on human error. Wake up people! When the percentage is that high, it is a signal that something else is at fault — namely, the systems are poorly designed from a human point of view. As I have said many times before (even within these RISKS mailings), if a valve failed 75% of the time, would you get angry with the valve and simply continue to replace it? No, you might reconsider the design specs. You would try to figure out why the valve failed and solve the root cause of the problem. Maybe it is underspecified, maybe there shouldn’t be a valve there, maybe some change needs to be made in the systems that feed into the valve. Whatever the cause, you would find it and fix it. The same philosophy must apply to people.

        (tags: don-norman ux ui human-interface human-error errors risks comp.risks failures)

      0 0


      For the last few years my favorite web site had become The Awl.

      And now, no more.


      I hope all those EXTREMELY talented writers and editors find good new locations elsewhere.

      0 0

      Calling the Rails API from a simple HTML page.

      Step 1:  Create a new form in the HTML page.


      <form action="javascript:void(0)">
        <input type="text" name="first_name" id="first_name">
        <select name="resident" id="resident">
          <option value="">Select</option>
          <option value="true">true</option>
          <option value="false">false</option>
        </select>
        <textarea name="notes" id="notes"></textarea>
        <input type="file" name="image" id="image">
        <button type="button" name="submit" id="form_submit_api">Send Data</button>
      </form>
      <!-- jQuery, used by the script in step 2 -->
      <script src="" integrity="sha256-DZAnKJ/6XZ9si04Hgrsxu/8s717jcIzLy3oi35EouyE=" crossorigin="anonymous"></script>

      Step 2:  Write a script that collects the form data and posts it to the controller action using Ajax.

      <script type="text/javascript">
          $("#form_submit_api").on("click", function(){
            // Collect the file and the other fields into one multipart payload.
            var formData = new FormData();
            formData.append('image', $('input[type=file]')[0].files[0]);
            formData.append('first_name', $("#first_name").val());
            formData.append('resident', $("#resident").val());
            formData.append('notes', $("#notes").val());
            $.ajax({
              type: "POST",
              url: "http://localhost:3000/api/v1/users",
              data: formData,
              // Send the FormData untouched and let the browser set the multipart header.
              processData: false,
              contentType: false,
              success: function(response) {
                // Reload the page once the request has succeeded.
                window.location.href = window.location.href;
              },
              error: function(response) {
                // Handle the failed request here.
              }
            });
          });
      </script>
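Before debugging the page itself, it can help to check the endpoint from the command line. The sketch below sends the same four fields as multipart/form-data with curl. The function name post_user, the API_URL override and the example values are made up for illustration; only the URL and the field names come from the form above.

```shell
# Hypothetical smoke test: POST the same fields the form sends.
# curl -F builds a multipart/form-data request; "@path" attaches a file.
post_user() {
    curl "${API_URL:-http://localhost:3000/api/v1/users}" \
        -F "first_name=$1" \
        -F "resident=$2" \
        -F "notes=$3" \
        -F "image=@$4"
}
```

For example: post_user "Jane" true "Created from the command line" ./avatar.png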

      Now passing the image and other form data in the FormData :-

      0 0

      Last month our 2010 27″ iMac stopped working. There was no drama, just a black screen where previously had been the High Sierra backdrop. A quick bit of investigation showed that it wasn’t as simple as a dead machine. The fans still blew and when turned off and on it made all the expected noises. There was just no picture. Plugging in an external monitor soon showed that the machine was usable and still functioning normally. After 7 years of service it seemed as if the LCD panel had finally failed.

      Murphy’s law stalks all such events and this time was no different. We’re in the midst of building a house so buying a replacement iMac wasn’t really an option. Looking at the pricing and specifications of the offerings, none seemed like value for money – even if we had the money and wanted to splash out. However, as the machine was my wife’s day to day workhorse, we would soon miss the abilities it offered. A solution was needed.

      When I built my desktop computer 15 months ago I chose standard, well supported components. Given the travel my work entails it often sits idle, so a possible low cost solution would have been for my wife to use my machine. Her MacBook Air is very old and frankly not very pleasant to use for anything beyond mail and surfing the web while my laptop is more than capable. Of course, my desktop runs Windows and she is happier with the world of Apple. But maybe…

      The Dark Side beckons…

      Hackintosh is a term I’ve seen a lot, but when I last looked it was very complex and involved a lot of jumping through hoops and using very specific components. Looking again in 2017 revealed how far things had evolved. A lot of reviews and blogs talked about success dual booting a single, home built machine with OSX and Windows. They all mentioned the tonymacx86 website which has a lot of information, tutorials and downloads to help.
      Using a monitor I found a suitable USB memory stick, registered with tonymacx86, downloaded the files and shortly afterwards had the USB stick ready for a try. I’ll be honest, I didn’t expect much when I plugged it in and powered up my desktop. The black screen with the Apple logo was a surprise. The progress bar slowly filling in was an even bigger surprise, but when I was asked to choose an installation language I was glad to be sitting down. Perhaps this could work?

      Needing somewhere to install OSX on my desktop I found an older disk and after moving some wiring around had it installed and ready to receive OSX. Powering up again and going through the installation instructions from the tonymacx86 website proved to be child’s play and it didn’t take long for the install to commence. The reboots took me a little by surprise, but the process ran without any issues and soon I was filling in usernames and viewing the High Sierra backdrop.
      Was everything perfect? No. That would have been too much to expect but I was sitting wondering what I was missing as up to this point it had been too simple.
      Continuing with the tonymacx86 installation instructions I ran the MultiBeast app. This presented the first questions that caused me to pause. What drivers did I actually need? Looking at the OSX Preferences app it was clear no network or sound card had been recognised, so drivers would be needed for those. Ticking the boxes that seemed applicable and installing the Clover bootloader all went as expected and I rebooted – to a black screen.


      This was, ironically, the same situation as the existing iMac – a responsive computer with a black screen. The fact it had worked previously meant it wasn’t a total roadblock and after some research I copied the USB stick EFI folder over the same folder on the installed bootloader. Rebooting rewarded me with a login screen and working network! The sound card had been recognised and was listed but as my sound is via the HDMI cable I still wasn’t hearing anything. Having a 4K monitor on my desktop has been great, but when running OSX it wasn’t great as the font size was far too small. Without a way to change the display font, using the machine was mildly frustrating, despite the change from the i3 2010 processor to the i7 2016 processor being very evident.

      Despite trying a large number of different approaches outlined by people, I wasn’t able to get the sound working via HDMI. The community seems to be gaining cohesion and the tools are certainly improving, but I still found myself looking at version incompatibilities and at outdated and incomplete instructions that were often filled with abbreviations that meant nothing to me. Clover is an interesting tool but isn’t as user friendly as many claim it to be.


      We discovered a fix for the black screen on the iMac during this process. It was a bug related to the iMac going to sleep. Armed with this knowledge we were able to restore it to full working order, removing the need for me to continue. While I haven’t taken this any further, when the iMac does eventually cease to be usable I will seriously consider building a Hackintosh. A quick costing with suitable components showed that it would save around £1000 over a comparable iMac, resulting in a machine that could be expanded and upgraded.