

This is Bertrand Florat's personal page. Here you will find links to articles and projects I'm involved in, along with a few thoughts (mainly about IT). Use the contact page if you want to discuss an article; I had to disable comments due to massive spamming.

“In theory, there is no difference between theory and practice. But in practice, there is.” -Yogi Berra

Technology blog

Architecture document template, V3 (modèle de dossier d'architecture)

See https://github.com/bflorat/modele-da

The template has been extended, simplified and corrected. Most importantly, it takes the path of living documentation by being rewritten in AsciiDoc (so it is now possible to submit merge requests, for instance). The diagrams are still in PlantUML but most of them have been reworked as C4 diagrams.

Feedback and PRs appreciated.

2019/09/01 02:00 · bflorat

One month with Ansible

Ansible is an open source IT automation tool written in Python and sponsored by Red Hat. Its best-known alternatives are Puppet, Chef and Salt.

I used Ansible for the first time (2.4.3, the latest release in early 2018) to produce some rather sophisticated Docker Swarm docker-compose files and other YAML configuration files that include a significant amount of logic (port number increments, conditional suffixes, a variable number of sections depending on lists of items, etc.).

I achieved my goals in about five or six days of effective work, including reading most of the official manual. Being able to complete such a real task in six days is acceptable when you first have to learn the tool, but I think I would have done it in a single day in bash (which I already know). However, Ansible is much more powerful. My first contact and real work with Ansible were really enjoyable and I was surprised to get it working so easily. I also managed to apply all the documented best practices. Sadly, I spent the last three days struggling with the final 5% of the work, dealing with limitations and bugs that I found hard to understand and quite irritating.

What I liked

  • The concept of desired state is very powerful: Ansible playbooks (lists of tasks to perform against some servers) are idempotent: only the final state has to be described (like “a /tmp/foo directory with 600 permissions”), not the actions required to reach it (like in bash: mkdir, chown, chmod…). It is powerful partly because you don't have to test for the existence of the final state (in bash with exit-on-error mode, you would have to check the existence of each directory, for instance); see the sketch after this list.
  • Ansible is agentless: nothing to install on the targeted servers. All you need is an SSH key exchange to allow headless SSH connections. Ansible generates Python scripts from the playbook, copies them using scp or sftp and runs them remotely over SSH as well.
  • A role is a kind of packaged operating procedure (like “add a MySQL user” or “create and configure an Apache server”). It enables a lot of reuse and is really great. A marketplace of shared roles is available on Galaxy.
  • The manual and reference documentation are good and extensive.
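
To illustrate the desired-state point above, here is a rough bash sketch of what the imperative equivalent of a single "directory with the right permissions" task looks like once you make it re-runnable (the /tmp/foo path and 600 mode come from the example above; the 'app' owner is made up):

#!/usr/bin/env bash
# Imperative equivalent of a declarative task: "/tmp/foo exists, owned by app, mode 0600"
set -e
DIR=/tmp/foo
if [ ! -d "$DIR" ]; then
    mkdir "$DIR"          # a plain mkdir would fail on a second run without this existence check
fi
chown app:app "$DIR"
chmod 0600 "$DIR"
# With Ansible, the same intent is one idempotent task describing the final state only.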

What I found irritating

UPDATE November 2019: all of the issues described here have been resolved in the meantime by the Ansible team, keep up the good work!

  • I don't like YAML for complex structures. I find it harder to read than JSON, and syntax errors are very frequent and cause a great waste of time. The data structures are described by (space) indentation, which I find brittle. Worse: different indentation forms can both be valid but mean different things (like a map of maps versus one more key/value pair in the current map). Validators exist but, AFAIK, formatters don't. However, YAML comes with nice features like comments and multi-document files.
  • Playbook execution is rather slow because of a new SSH connection for each task, plus one to send the generated Python scripts to the remote host. Note however that even if tasks are always executed sequentially, each task is run in parallel against all the targeted servers.
  • You need to create a playbook that just wraps a role in order to run it; you cannot launch a role directly from the command line.
  • I wasn't able to increment a variable inside a loop in a Jinja2 template (https://github.com/pallets/jinja/issues/641). This is a feature, not a bug. Incrementing things (like ports) is nevertheless a very basic requirement IMO. Fortunately, there is a workaround (using a list, append and pop).
  • It is difficult to debug the templating, especially when using template fragments (with import). On any template module error, you only get the playbook line and the full template content (very difficult to read, BTW).
  • I find the syntax sometimes twisted, like when we sometimes have to use double quotes around variables and sometimes not. Also, why should we add whitespace around variable names (like {{ myvar }})? I find this ugly and annoying. Apparently, we can drop the spaces in playbooks but not in Jinja2 templates…
  • Ansible is not compatible with Python 3.0 to 3.5. Sometimes (like with the copy module), I didn't get any error message even though the Python package on the target server was unsupported.
  • It is not possible to copy recursively with remote_src (https://github.com/ansible/ansible/issues/14131). I had to use a hack (run the template on the Ansible host using connection: local) and then copy using src instead of remote_src.

Final thoughts

In conclusion, Ansible is a good product but can become cumbersome when you try to make it run too much logic. It is mainly a declarative system, not an imperative one. Next time, we'll have a look at Salt; it may be a more suitable solution, or maybe not.

2018/02/03 20:59 · bflorat

Summary of Cal Newport's "Deep Work" book

I just finished "Deep Work", an interesting book. My only regret is that it doesn't contain any reference to the Pomodoro technique.

Here are a few raw notes:

Deep work: “professional activities performed in a state of distraction-free concentration that push cognitive capabilities to their limit”. Demands high skills, difficult to replicate.
Shallow work: “non-cognitively demanding, logistical-style tasks, often performed while distracted.” Low value, easily replicable.
Deep work hypothesis: the ability to perform deep work is rare and valuable. Those who are capable of it will thrive.
The core abilities:
- quickly master hard things
- produce at an elite level, with speed
Both depend on deep work.

Myelin: by always triggering the same neural paths, better signal -> more focus = more intelligence
High quality work = time x intensity of focus

Metric black hole: we don't actually measure the value of the tasks we perform
Principle of least resistance: given that we don't actually measure the value of our work, we do what is easiest first: shallow work.
Busyness as a proxy for productivity: in knowledge work, it is difficult to estimate our own value, so a lot of shallow work creates a false feeling of produced value
Cult of the Internet: everything coming from the Internet (like Facebook) is considered a priori as good in IT: huge error.
Neuroscience: what you are is the sum of what you focus on. We are happier when we focus on flow activities. We need goals, challenges, feedback.
We all have a limited amount of willpower, so we need to save it for deep work.

Profiles of deep workers:
- bimodal: monastic-like isolation for a few days, shallow work the rest of the time
- rhythmic philosophy: a moment reserved every day; use a chain method like a cross on the calendar: we want to avoid any hole in the chain
- journalistic philosophy: switching between shallow work and deep work all day long (hard)

Ideas to help deep work:
- grand gesture: break your habits, work in a hotel for instance
- help serendipity by meeting people from other disciplines
- stop working in the evening to let the unconscious mind solve problems for you (less work = more CPU to solve problems in the background of your mind)
- also rest, because we all have a limited amount of available attention
- perform a shutdown ritual at the end of every day (like saying 'work performed') -> the brain is conditioned to stop running thoughts. Otherwise, Zeigarnik effect (we remember interrupted tasks better because we want to resolve them)
- seek boredom to help the brain rewire
- schedule the day in blocks, change the blocks during the day if required

Deep work meditation to solve complex problems:
- store the variables of the current state of the problem
- ask a question to force the brain to move on to the next step instead of looping
- fight distracting thoughts

Memorization technique (see the book for more details): imagine large objects in five rooms of your house, map the objects to a set of celebrities and imagine scenes. Each person maps to a value (like the number or value of a card).

Avoid any-benefit tools like Facebook; concentrate on the craftsman approach: only consider tools that significantly help you reach your lead goals.
To determine whether a tool helps:
- list the key activities you need to carry out to reach the lead goals
- for each activity, ask yourself whether the tool helps or not

4DX (the Four Disciplines of eXecution):
- focus on a few wildly important, measurable goals
- act on lead measures, not lag measures
- use scoreboards
- hold periodic reviews

Law of the vital few (Pareto principle): 80% of a given effect comes from 20% of the possible causes
During leisure time, avoid using the Internet; do high-level activities like reading literature
Evaluate the shallow work performed each week, present it to your boss and ask them to validate it.
To determine whether a task is shallow: how many months would it take to teach a hypothetical postgraduate to do it?
Say "no" by default; provide a vague explanation to avoid follow-up questions.
Process-centric e-mails to close the loop and free the mind: clearly state the next steps on every subject (every action)
Avoid replying to e-mails on subjects of no interest, or that would require too much work to answer, etc.
2016/11/04 20:41 · bflorat

Dashboard under XFCE real howto

If, like me, you like both XFCE and the GNOME Shell dashboard/window picker, here's how I configured my desktop for the closest GNOME-like experience:

1) Install xfdashboard (the dashboard itself). I used version 0. Note: this release comes with a hot-corner plugin, so there is no more need for xdotool or brightside.

2) Add or enable this command to be run at X startup (in XFCE Settings / Session and Startup / Application Autostart): xfdashboard -d (daemon mode, for a faster display)

3) Configure xfdashboard using xfdashboard-settings:

  • In 'Plugins', select the 'hotcorners' plugin
  • make sure to restart xfdashboard to enable this new plugin: xfdashboard -q, then xfdashboard -d &

4) Add your preferred applications to the vertical side bar (there is no GUI for this; xfce4-settings-editor cannot edit arrays). Here's a sample command:

xfconf-query -c xfdashboard -p /favourites -n -t string -s "exo-file-manager.desktop" -t string -s "exo-terminal-emulator.desktop" -t string -s "jetbrains-idea-ce.desktop" -t string -s "owncloud.desktop" -t string -s "simple-scan.desktop" -t string -s "gnome-calculator.desktop" -t string -s "firefox.desktop" -t string -s "thunderbird.desktop" -t string -s "zim.desktop" -t string -s "libreoffice-writer.desktop"
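
You can then read the property back to check the resulting array (same channel and property as above):

xfconf-query -c xfdashboard -p /favourites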

5) If you are in multi-monitor mode and you want to see all windows on the primary display rather than spread over several monitors, here is my workaround: in /usr/share/themes/xfdashboard/xfdashboard-1.0/xfdashboard.css (or in the other themes' xfdashboard.css files), change filter-monitor-windows: true; to filter-monitor-windows: false;
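
The same change can be scripted; a one-liner sketch assuming the default theme path quoted above (back the file up first and restart xfdashboard afterwards):

sudo sed -i 's/filter-monitor-windows: true;/filter-monitor-windows: false;/' /usr/share/themes/xfdashboard/xfdashboard-1.0/xfdashboard.css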

2016/10/24 16:12 · bflorat

Benefits of Hardware-based Full Disk Encryption and sedutil

We need to protect our personal or professional data, especially when it is located on laptops that can easily be stolen. Even if the practice is not yet fully widespread, many companies and individuals encrypt their disks to prevent such issues.

There are three major technologies for encrypting data (most of the time, the same symmetric cipher is used: AES with 128- or 256-bit keys):

  • File-level encryption tools (7zip, GnuPG, OpenSSL…) where we encrypt one or more files (but not a full file system)
  • Software FDE (Full Disk Encryption: dm-crypt, encfs, TrueCrypt under Linux; BitLocker, SafeGuard under MS Windows, among many others) where a full file system is encrypted. Most of these tools map a real encrypted file system to a clear in-memory view. For instance, you open an encrypted /dev/sda2 filesystem with dm-crypt/LUKS this way:
sudo cryptsetup luksOpen /dev/sda2 aClearFileSystemName  
<enter password>
mount /dev/mapper/aClearFileSystemName /mnt/myMountPoint
  • Hardware-based Full Disk Encryption (also called SED, Self-Encrypting Drive), where hard disks encrypt themselves in their own built-in disk controller. We'll focus on this technology here.

To make it work, you need:

  • a SED-capable hard disk or SSD (I for one own a Samsung 840 PRO and an 850 EVO that support it; most professional disks do).
  • a compatible BIOS that supports SED. You can then set a disk-level user password in the BIOS (and optionally an administrator password able to unlock the user password). When the computer boots, the BIOS asks interactively for the disk password [1]. Note that many BIOSes (especially on desktops or on non-professional laptops) don't support this feature because the manufacturer has not enabled it (maybe to avoid customer complaints about password loss?).

Once the correct BIOS disk password has been entered, the disk becomes totally 'open' (we say 'unlocked'), exactly as if it had never been encrypted. No software is involved afterwards. It is important to understand that a SED always encrypts the data. There is no way to disable this behavior (it doesn't have any significant effect on I/O performance, though, because the I/O volume is unchanged and the disk controller comes with a built-in AES chipset). The real encryption key (MEK, Media Encryption Key) is located inside the disk itself (and cannot be accessed). The user password (named KEK, Key Encryption Key) is used to encrypt/decrypt the MEK. Keeping the disk password unset is like keeping a safe open: the data is still encrypted, but it is decrypted on access exactly as if no security system ever existed. When you set the user password, you close the safe door with your key. Note that there is no (known) way to recover a disk if you lose your password: you not only lose your data, you also lose your disk: it becomes a piece of junk from which no data can be read or written.

I used dm-crypt (the default FDE software under Linux) on my own laptop. As soon as I bought SED-capable Samsung SSDs I wanted to switch to hardware encryption, but I never managed to use it on my own computer because my AMI BIOS doesn't support this feature. The only option then was software file-system encryption. This works but comes with several complications and drawbacks:

  • you need a /boot partition in the clear to bootstrap the process. An attacker can easily alter this partition and add a keylogger, for instance;
  • you have to change some kernel options and make sure the modules are loaded in the right order at startup or resume (and keep these settings when updating the kernel);
  • the SSD TRIM feature [2] is now supported by dm-crypt but it comes with security concerns;
  • you need the dm-crypt commands on live-CD distros when performing system backups.

The only benefit of software FDE I can think of is the possibility to review the cipher source code (when using an open source solution like dm-crypt, of course). This is not possible with hardware encryption, even though no severe issue has been reported so far AFAIK.

SED hardware-based disks are much simpler to use in comparison:

  • you only have to set a BIOS password and it's done !
  • it is possible to destroy a drive definitively by changing its password once and for all, when decommissioning a laptop for instance (but this is also a drawback when the password is lost unintentionally).

But:

  • once unlocked, the disk remains in this state as long as the computer is powered (including while suspended to RAM). The login window doesn't change anything: an attacker can read the drive by plugging directly into the SATA port (DMA attack) and, even worse, a warm reboot (a restart) keeps the drive open! It means that one can access the unlocked disk simply by inserting a live CD/USB and rebooting the computer: the live CD/USB boots and all the drive data is available once mounted! This is why, when using SED, you should always hibernate (suspend to disk) instead of suspending to RAM: when hibernating, the drive actually loses power and is locked again. Of course, you get the same effect when turning off your computer.
  • you need a SED-capable BIOS. Note that you can also use the hdparm command to unlock a SED drive, but it requires booting a live CD/USB, launching something like the command below and then restarting your computer. However, it is not really practical;
sudo hdparm --user-master u --security-set-pass 'pass' /dev/sdb
  • if you lose the disk password, the disk is simply dead (but this may be a benefit, as stated before);
  • you may depend on a particular BIOS manufacturer because it trims or hashes the disk password (KEK). Another BIOS may use another algorithm. It means that moving a drive from one computer to another may leave you unable to unlock the drive, even with the same password;
  • because the operating system and its settings are not yet loaded, only the QWERTY keyboard layout is available; you have to keep this in mind when choosing and typing the password;
  • you have to trust the hardware security chipsets.

The OPAL specification published by the Trusted Computing Group (AMD, IBM, Intel, HP…) fixes some of these issues:

  • you can still save the disk if you lose the disk password (of course, the data is still lost, and fortunately so, otherwise it would defeat the encryption) thanks to the PSID revert function (the PSID is a number printed on the disk proving that you have physical access to the drive);
  • the KEK hashing and trimming is now standardized: the same drive can be moved from one computer to another;
  • you can use SED even without BIOS support because OPAL comes with a mechanism called the 'shadow MBR'. Basically, you flash a mini OS (the PBA, Pre-Boot Authorization, up to 128MB) to a dedicated area of the disk. This OS is handed to the BIOS at boot time and displays a password prompt. If the password is correct, the real MBR of the drive (the Master Boot Record, i.e. the boot code) is then decrypted and executed. No more need for BIOS SED support and, even better, an open source OPAL implementation (sedutil) is available; its source code can be reviewed much more easily than a binary BIOS firmware.

The new sedutil project comes with:

  • some PBA images ready to flash to the drive
  • the sedutil-cli command to administer the OPAL disk (setting up a drive in OPAL configuration, changing the password, PSID revert…). Note that these commands require adding libata.allow_tpm=1 to the kernel flags if run from an installed Linux. You can also, like me, use sedutil-cli from a rescue image booted from USB. See the list of commands and how to set up a drive; a rough example follows this list.
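
For illustration only, a minimal sketch of the kind of session described above, assuming the drive is /dev/sda and using the sedutil-cli sub-commands as I remember them; check the project wiki for the exact, current syntax:

# kernel booted with libata.allow_tpm=1 if running from an installed Linux
sudo sedutil-cli --scan                                       # list drives and their OPAL support
sudo sedutil-cli --initialsetup MyStrongPassword /dev/sda     # take ownership of the drive and set the password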

This worked perfectly for me and I now use my Samsung 850 EVO drive in SED OPAL mode. Note that sedutil doesn't support suspend to RAM (when resuming, the drive behaves as if it were dead and you get I/O errors all over the place). Always use hibernation instead (as I already stated, it's the only safe way to use SED drives anyway).

[1] Note that this has nothing to do with the main BIOS user password that “protects” your machine (with that one, your disk data is still in the clear and can be read simply by moving the disk to another computer or by removing the BIOS battery).

[2] TRIM is used on SSDs to free unused blocks as soon as possible and increase the disk's lifespan.

2015/11/03 21:26 · bflorat

The IT crowd, entropy killers

http://www.channel4.com/programmes/the-it-crowd

I once asked myself: “how can we, computer scientists, define our job in the most general sense of the term?”

Our fields are very diverse but, in my opinion, the greatest common divisor is “entropy hunter”.

Everything we do is geared toward the same goal: decreasing the level of complexity of a system by modeling it and by transforming a bunch of semi-subjective rules into a Turing-machine program that cannot execute anything indecisive.

Everything we do, including documentation, workshops with the stakeholders and project management, and not only the programming activities, should be about chasing doubt. Every word, every single line of code should kill ambiguity.

Take design activities: most human thoughts are fuzzy. This is the reason why waterfall (traditional) project management processes, where all the design is done in one go, can't work: humans need to see something in order to project themselves using it and deepen their understanding.

Business designs are subjective in many ways, for instance:

  • by omitting cases (or, less often, by describing nonexistent ones)
  • by word ambiguity. Here's a small anecdote: last week, I worked on a specification document written in French containing the word “soit”: “the file contains two kinds of data, soit data1 and data2”. This sentence could be understood in two opposite ways because the French word “soit” can mean “either/or” but also “i.e.”. Hence, this sentence could mean either “the file contains both the data1 AND data2 kinds” or “the file contains the data1 OR data2 kind”. I run into this kind of uncertainty several times a week.
  • by a lack of examples. Examples are much more demanding and easier to challenge; they require a better understanding of the system. Moreover, designing by example (as in BDD) tends to be more complete because once you start providing nominal examples, you are tempted to provide the corner-case ones too (read BDD in Action by John Ferguson Smart for more).

By contrast, a program is deterministic. It is a more formal (modeled, hence reduced) version of a complex reality. The more cases and rules a reality needs in order to be described entirely, the more complex the program becomes, but it remains much simpler than the reality it describes.

The quality of everything we do should, IMO, be measured by the amount of complexity we put into our programs. The less complexity we use to model a system, the better the program.

2015/03/02 21:23 · bflorat

Deployment scripts should always be refreshed from the VCS prior to execution

After a few months of writing continuous-deployment scripts for a pretty complex architecture (two JBoss instances, a Mule ESB instance, one database to reset, a BPM server, each running on a different server and restarted in the right order), I figured out a good practice in this field: the scripts have to be self-updating.

When dealing with highly distributed architectures, you need to install this kind of deployment script (mostly bash) on every node involved, and it soon becomes very cumbersome and error-prone to maintain them on every server.

We now commit them to a VCS (Subversion in our case), which is the master location of the scripts. Then, we try:

  1. To check them out before running them, whenever possible. For instance, we use a Jenkins job to launch our deployment process (written as a bash script). The job is parameterized to check out the SVN repository holding the script before running it from the Jenkins workspace. This is very convenient.
  2. When this is not possible (for instance when the script has to be executed on a server other than the CI server), we check out the script on the Jenkins server and push it (using scp for instance) to the targeted server before executing it (over ssh).
  3. Sometimes, when the call must be asynchronous on another server, we simply trigger a script by remotely creating an empty file. A very simple cron-ed bootstrap script (not refreshed itself) detects the trigger file, updates the script (svn co) and runs it; a sketch follows this list.
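
A minimal sketch of such a bootstrap script, with hypothetical paths and repository URL (not the actual ones we use):

#!/usr/bin/env bash
# Cron-ed bootstrap: runs periodically and fires the deployment script when a trigger file appears.
set -e
TRIGGER=/var/run/deploy.trigger            # empty file created remotely to request a deployment
WORKDIR=/opt/deploy/scripts                # local working copy of the deployment scripts
REPO=https://svn.example.org/deploy/trunk  # hypothetical SVN URL

if [ -f "$TRIGGER" ]; then
    rm -f "$TRIGGER"
    # always refresh the scripts from the VCS before executing them
    if [ -d "$WORKDIR/.svn" ]; then
        svn update "$WORKDIR"
    else
        svn checkout "$REPO" "$WORKDIR"
    fi
    "$WORKDIR/deploy.sh"
fi
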
2015/02/12 20:30 · bflorat

Feedback on Eclipse DemoCamp 2015 Nantes

I had the pleasure of attending the Eclipse DemoCamp Nantes last Thursday at the Hub Creatic (it is hard to find as it is not signposted yet; it's the bright yellow building next to the Polytech Nantes engineering school. It was the first time I went there and I must say I was impressed, too bad it's not downtown).

We got an extremely eclectic but fascinating overview of the Eclipse world in 2015, from the Internet of Things (IoT) to the software factories of large companies, including computing for children. This shows, if proof were needed, the pulling power of the Eclipse world as an IDE of course, but above all as a platform.

Gaël Blondelle from the Eclipse Foundation

explained it very well: the strength of Eclipse is above all its federating power: the Luna release was built by 400 developers from 40 different companies.

The notion of a release train (the simultaneous, yearly delivery of all projects in June) ensures stability and quality integration between the hundreds of plugins.

Another emerging notion is the Working Groups, which gather work by theme, such as:

  • LocationTech, GIS-oriented. One of the most innovative projects of this group is Mobile Map, which generates maps computed directly on the smartphone.
  • IoT, federating the projects around the Internet of Things. Two interesting projects: Eclipse Smart Home for home automation and Eclipse SCADA, which provides SCADA (Supervisory Control and Data Acquisition) libraries and tools used to monitor many kinds of hardware.
  • Eclipse Science, for scientific visualization and processing projects.
  • PolarSys, gathering projects driven by Thales, the CEA, Airbus, Ericsson… for modeling projects around embedded systems (Papyrus SysML, Capella…).

Laurent Broudoux and Yann Guillerm, architects at MMA

then presented their Eclipse deployment history and their multi-version strategy. Their IT department gathers 800 people, including 150 Eclipse users working on projects ranging from legacy (Cobol, Flex, historical Java) to more innovative ones (mobile, Grails-based web applications…).

In short, building a new version of the workbench (a single one until 2012) took up to 50 man-days, starting from a base Eclipse and integrating/testing all the required plugins. The new strategy has two parts:

  1. Build a modular workbench in three layers: 1) a base (seed): a pre-packaged Eclipse distribution; 2) community plugins (Confluence, Mylyn, Subclipse…); 3) in-house plugins, mainly around the Confluence groupware tools.
  2. Differentiate the workbenches according to needs (6 variants, 4 families):
  • a legacy factory based on Galileo
  • a “High Tech” factory for “usage” projects (CMS, mobile) based on the Grails GGTS distribution (and soon integrating the Android ADT technologies);
  • a “core business” factory based on Juno (I didn't note which Eclipse distribution is used as the seed) for JEE applications (it provides the technical building blocks for SQL and NoSQL persistence, UML modeling, the MDD tools (Acceleo, ATD…), M2E for Maven integration…);
  • an architecture modeling factory to manage the application portfolio, impact studies and project scenarios… This enterprise modeling is based on a metamodel derived from TOGAF. This workbench uses the SmartEA seed (from OBEO).

The factories' technology stack notably relies on Mylyn (task management), Confluence (enterprise wiki), Maven, Chef for configuration management and SVN as the VCS.

Stéphane Bégaudeau from OBEO

then presented the development and integration tools of the NodeJS ecosystem. Scaffolding by archetype is done with the Yeoman tool. The package manager for JS libraries is npm. The angular.js, ember.js and backbone.js libraries/frameworks are also available. Bower is a package manager for JS libraries. The build is done either with Grunt (configuration-over-code model) or, preferably, with the more recent and simpler Gulp (code over configuration). Min.js provides code minification. For testing, there are Jasmine (BDD), Mocha and QUnit. PhantomJS and CasperJS allow headless testing. Istanbul handles code coverage analysis. JSHint performs style checks. Karma tests the ubiquity of pages (responsive design). Finally, Stéphane presented Eclipse Orion, the web-based Eclipse IDE built on NodeJS. Among other things, this IDE provides code completion and syntax highlighting, comes with very good Git support and can be extended with plugins.

Hugo Brunelière from AtlanMod

introduced us to the ARTIST research program, which provides model engineering tools and methodologies to migrate a traditional application into a cloud-friendly one. The €10M program is developed mainly by INRIA and ATOS (Spain). The program offers:

  • Methodology, through a handbook and a certification model.
  • Business and technical feasibility analysis tools, reverse engineering and optimization tools.

Modeling is done mainly in stereotyped UML under Enterprise Architect. Textual DSLs based on Xtext are also used, as well as graphical DSLs based on Sirius. M2T analysis is done with MoDisco and M2M model transformation with ATL. Reporting is based on BIRT. The methodology is tooled by EPF (Eclipse Process Framework). A cloud-friendly maturity model has been developed: the MAT (Maturity Assessment Tool) model.

Stévan Le Meur from Codenvy

gave us a demonstration of Eclipse Che, a SaaS platform for developers based on Orion and Docker. A development workstation can be provisioned very easily and then “deployed” as a pure web environment (this is the “Codenvy Factory” concept). It is possible to select and then run Docker containers running Tomcat, JBoss or other application servers, locally or remotely. A preview of a new GitHub integration feature (clone then pull request in a few clicks without installing anything) finished blowing us away.

Maxime Porhel from OBEO

presented a graphical programming environment for Arduino boards aimed at children. This very simple graphical DSL was of course developed with Sirius (the open source version of OBEO Designer). A very funny demo proved the concept on an Arduino AVR board included in a DFRobot kit. My kids are going to be happy :-)

Finally,

Fred Rivard, founder of IS2T

explained the economic and technological stakes of embedded Java. 100 billion microcontrollers costing $1 to $15 are currently deployed worldwide. 25% of them run in “mainstream” environments: iOS, Android and Linux. The rest is extremely fragmented across hundreds of technologies that are still programmed in assembly. The starting ticket for a project is at least €1M and the product must ship in less than six months to stay profitable against the competition. Big Data will only be able to grow harmoniously if the “little data” (devices, the IoT) that feeds it becomes cheaper. IS2T aims to develop embedded JVMs that are extremely fast and memory-light (worthwhile, memory-wise, starting from 100K of flash compared to classical code). All these technologies are gathered around the MicroEJ platform. IS2T is also developing a “store” of embedded applications for this kind of hardware. Fred playfully presented many use cases, such as a smartwatch that lights up in 48ms whereas it takes 500ms to raise one's arm to read the time: the watch can therefore stay powered off most of the time and its battery life is multiplied.

Also worth noting, a few asides during the session

about Oomph, a new installer for Eclipse plugins that also makes it possible to centralize developer settings.

2015/02/09 20:55 · bflorat

Programming is craftsmanship and requires skills

Many managers think that programming is easy; after all, it's just a bunch of for, if, else and switch clauses, isn't it?

But coding is difficult because it is mainly about MAKING DECISIONS ALL THE TIME.

Driving is easy because you don't have to make decisions about how to turn the steering wheel; walking is easy, you don't even have to think about it; drilling a 10 mm hole into a wall is easy because the goal is clear and because you don't have many options to achieve it…

Software is difficult, and is craftsmanship, because there are always many ways to achieve the same task. Take the simplest example I can think of: an addition function: we want to add a and b to get c = a + b.

* Should I code this the object-oriented way (a.add(b)) or the procedural way (add(a,b))?

* How should I name it? add()? sum()? How should I name the arguments?

* How should I document the function? Are there project conventions about it?

* Should I return the sum or store it in the object itself?

* Should I code this test-first (TDD), write a unit test afterwards or write no test at all?

* Does my code scale well? Does it use a lot of memory?

* Which visibility for this function? private, public, package?

* Should I handle exceptions here (a is null, for instance) or leave that to the caller?

* Should the arguments be immutable?

* Is it thread-safe?

* Should this function be placed in a utility class?

* If I'm coding in an object-oriented way, is it SOLID-compliant? What about inheritance? …

* … dozens of other questions any good coder should ask themselves

If all of these decisions could be made by a machine, coders would not be required at all because we would just generate the code (and we sometimes do, using MDD technologies, mainly for low-added-value code skeletons).

We, coders, would then all be looking for a new job. But, AFAIK, this is not the case; we are still needed, still relevant. All companies still need costly software craftsmen!

Q.E.D. ;-)

I couldn't agree more with the Manifesto for Software Craftsmanship.


2015/01/02 11:11 · bflorat

Undocumented PreparedStatement optimization

We just got a 20% response-time gain on a 600+ line query under Oracle. Our DBA noticed that queries were faster when launched from SQL Developer than from our JEE application using the Oracle 11g JDBC driver. We looked at the queries as they actually arrived at the Oracle engine: they were of the form SELECT… WHERE col1 LIKE :myvar1 OR col2 LIKE :myvar2 AND col3 IN (:item1, :myvar2, …), and not SELECT… WHERE col1 LIKE :1 OR col2 LIKE :2 AND col3 IN (:3, :4, …) as is usual when using PreparedStatement the regular way.

Indeed, every PreparedStatement documentation I'm aware of, beginning with the one from Sun, states that we have to use '?' to represent bind variables in queries. These '?' are replaced by :1, :2, :3… by the JDBC driver. So the database has no way to know, in our case, that :2 and :4 hold the same value. This information is lost.

We discovered that we can use PreparedStatement by providing queries with named bind variables instead of '?'. Of course, we still have to set the right value using the setXXX(int position, value) setters for every bind variable occurrence in the query. Queries then arrive at Oracle exactly as they do from SQL Developer, with named bind variables.

OK, but what's the deal with all this?

I'm not sure, but I think this optimization may allow the Oracle optimizer to be cleverer, especially on queries with redundant parts. It is especially good for queries with duplicated sub-SELECTs whose IN conditions all contain the same list of items. Maybe Oracle creates on-the-fly WITH clauses or similar optimizations in this case?

Note that this optimization may only work with Oracle and is probably only useful for very large or redundant queries. I don't recommend it in most cases. AFAIK, neither Hibernate nor Spring-JDBC implements this optimization.



2014/12/18 20:06 · bflorat

Looking back at Agile Tour 2014 Nantes

I had the chance to attend the Agile Tour 2014 day, Nantes edition, at the École des Mines. Well organized, rich in encounters and feedback from the field, as every year…

The world cafés

An interesting innovation this year: the 'world cafés' between the talks, during which a topic is discussed by an ephemeral group and only one member (the scribe) stays to consolidate the ideas, which are then presented. A concept that fosters exchanges between participants. On this occasion, I notably got to talk with the manager of a large mutual insurance company who explained that she had trouble finding agile application-maintenance (MCO) services, whereas on our side we still had trouble finding customers ready to go (truly) agile by putting at the front of the project a PO (Product Owner) with decision-making power, functional expertise and the time to invest in their project.

How to involve your customers in their projects?

I simply loved this talk, both very concrete and deep. Benoit Charles-Lavauzelle (CEO of Theodo) and Julien Laure (agile coach, scrum master) presented the history of their company and how they (now) deliver successful projects with Scrum. The company, which developed fixed-price projects (B2B sites in PHP/Symfony), was close to bankruptcy in 2011. Customer dissatisfaction was high because of the tunnel effect: once finished, the applications did not match the need the customer thought they had expressed. The company then turned to Scrum, which it applied by the book. The failure was big and the cause may seem obvious in hindsight: there was no PO on the customer side, hence no involvement. Without a PO, the project navigates blindly. In 2013 the company decided to only run Scrum projects with strong customer involvement. Despite strong reluctance from customers who did not want to be billed on a time-and-materials basis instead of a fixed price, the company saw its revenue grow from 1.2 to 5M€ this year. Customers came for the PHP/Symfony technical expertise and stayed for the quality and the respect of deadlines (95% of customers recommend the company).

How did Theodo manage to involve the customer?
  • First, reassure the customer: invite them to sprint planning and estimate with them (planning poker) so that they realize the technical difficulties. Run short sprints (one week here).
  • Be transparent; Theodo precisely tracks every deviation from the standard (see slide 28).
  • Burndown charts visible to the customer, live, through web tools.

What makes a good PO?

  • Choose a PO who (really) carries the project and has decision-making power (beware of casting errors).
  • Permanent feedback with the PO is needed: a weekly evaluation system covering velocity and support.
How to get the PO to validate the work?
  • An electronic board with the tasks to validate: Trello (very simple for the customer to use).
  • A daily digest e-mail with all the pending questions and the important URLs, with the manager (n+1) in copy, sent after the daily.
  • An agile self-assessment sheet (see slide 44) makes it possible to evaluate the “technical” quality of the sprint and to arbitrate between the short and the long term.
Outcome
  • The PO works one to two days a week with the team, and that is not too much!
  • A new problem emerges with large accounts: the distance from the PO and the spread of proxy POs representing the PO on the vendor side. A proxy PO is better than nothing (but barely).

Collective intelligence in the service of innovation and industrialization

Clément Duport (Alyotech) shared his vision of innovation. He explained that the Gordian knot of current IT policies in this area lies in the ambivalence between creativity, risk and freedom on the innovation side, and harmonization, control and order on the industrialization side. This leads to a real schizophrenia (OK, at Capgemini we have the Lab'Innovation which partly solves this dilemma by offering this innovation space to our customers). In fact, he explained that both are needed to move forward; you have to find the right balance between order (to survive) and disorder (to move forward). “To create is to remember what has not taken place” (Siri Hustvedt). Innovation can emerge from an industrial approach, through the recombination of ideas.

Designing as a team without an architect

Ly-Jia Goldstein shared her experience as a developer in a team following the precepts of software craftsmanship and XP. She explained that a good XP development process relying on BDD, while empowering team members as much as possible (with collegial technical decisions), can do without a (software) architect. This has many advantages such as a better bus factor, greater project reactivity and smoother refactoring. Good points were raised. Nevertheless, from my point of view the talk only revolved around the software architect role. It seems to me that a general architecture framework (enterprise architecture, technical architecture, solution catalogue, industrialized framework, CI platform) is unavoidable in large information systems, even if it is true that the teams, largely made up of engineers, would benefit from being more proactive on the software side and avoid certain all-too-familiar situations.


2014/12/15 20:14 · bflorat

How to get bind variable values from Oracle

If you have already used JDBC prepared statements, you know what bind variables are: the '?' in the query, as in SELECT col1, col2 FROM t_table WHERE col1 IN (?,?,?) AND col2 = ?. For the record, all compiled queries with the same number of '?' are cached by Oracle and are therefore (most of the time) faster to execute. But how do you debug the values actually passed? This is often valuable, like yesterday when one of our services tried to insert a value too large for a column (a 4-digit integer into a NUMBER(5,2)).

There are several ways to achieve this; one is to use a 'wrapper' JDBC driver (like log4jdbc) that audits and logs the values, but it's a bit intrusive.

A very simple, non-intrusive way for a one-off need is to query the v$sql view, Oracle's internal statement log. A sample query is given below (source: Stack Overflow):

select s.sql_id, 
       bc.position, 
       bc.value_string, 
       s.last_load_time, 
       bc.last_captured
from v$sql s
  left join v$sql_bind_capture bc 
         on bc.sql_id = s.sql_id 
        and bc.child_number = s.child_number
where s.sql_text like 'delete from tableA where fk%' -- or any other method to identify the SQL statement
order by s.sql_id, bc.position;

It works like a charm !



2014/12/10 22:05 · bflorat

Move to GitHub done smoothly

The Jajuk issue tracker and Git repository have now been moved to GitHub (see the previous article for the context).

Repository move

Obviously and by nature, the Git repository move was very simple. I just had to drop my previous origin (pointing to the Gitorious project URL), add the new GitHub origin and push all my branches. Pushing the master branch took around 30 minutes and the other branches (develop, hotfix) almost no time at all. Note that the -u option used in the push command recreates the upstream tracking references.

git remote remove origin
git remote add origin git@github.com:jajukteam/jajuk.git
git push -u origin master

The only problem occurred when dropping our Gitorious repository (error 500 → timeout?)

Issue tracker move

I tried several Trac-to-GitHub migration tools; most of them didn't work and I finally settled on trac2github. It is written in PHP, reads the Trac database (supports MySQL, PostgreSQL and SQLite) and calls the GitHub REST API v3 to create the tickets. It creates the milestones, labels, tickets and comments with good defaults. It had some bugs when working with a PostgreSQL database and I had to patch it (two of my pull requests have been merged). I also pushed a patch to obfuscate e-mail addresses in comments.

I also ran into another problem (not linked to the migration tool): we used the DeleteTicket Trac plugin to drop spam tickets, but GitHub issue ids have to be contiguous. Source and destination issue ids are therefore now shifted; this is a problem when code comments reference a ticket number, but AFAIK there is no solution to it.

Have a look at the brand new issue tracker: https://github.com/jajuk-team/jajuk/issues


2014/12/09 22:21 · bflorat

BitBucket vs GitHub issue tracker choice for Jajuk

We are currently moving our Jajuk Trac issue tracker to a better place, mainly because of spam. A developer suggested BitBucket, others (me included) GitHub, which I already use. I cloned our secondary project QDWizard to a private BitBucket repository to form an opinion. I have to say BitBucket is really good too.

In my opinion, both systems deliver the most important features:

  • Simple to import from Trac.
  • Export facilities to make change possible in the future.
  • Clean and simple GUI.
  • Clean roadmap/version support.
  • Assignment facilities.

But:

  • GitHub has many more users (around 4M compared to 1M for BitBucket). More developers already have accounts and are used to it.
  • GitHub GUI is a bit faster.
  • GitHub is more “open source” minded; I feel BitBucket is more enterprise-oriented (private repositories).
  • BitBucket is free only for up to 5 developers.

Specifically about issue management: the issue manager in BitBucket is not actually Jira but a lightweight tracker. Thankfully, it doesn't come with full workflow support. Like most trackers, each ticket has a type (a “kind”: bug, enhancement, proposal, task), a priority (trivial, …, blocker) and a status (“workflow”: “on hold”, “resolved”, “duplicate”, “invalid”, “wontfix” and “closed”). Note that these states can be neither changed nor augmented (many users asked for a “tested” status but it has never been added). It's like Trac without the ability to define new types and statuses. Some Jajuk Trac types are not supported: “known issue”, “Limitation”, “patch”, “support request”, “to_be_reproduced” (and we map our “discussion” to BitBucket's “Proposal”). Some statuses are missing too: “worksforme”, “not_enough_information”. I suppose a migration would have forced us to map several statuses and several types to the same BitBucket kind/workflow.

For its part, GitHub comes with (in my opinion) a very elegant solution: there are no ticket priorities, types or states, only “labels” such as “important”, “bug”, “wont fix”, <whatever>… OK, it may be more lax, but on the other hand:

  1. it allows adding any label to qualify a ticket along any aspect you can think of;
  2. it doesn't force you to use potentially useless fields like priority.

I suppose the migration script will be able to simply create new labels to reflect our existing types and statuses (yet to be proven). We still have to run the migration script; I'll probably test this this weekend.

2014/11/25 22:39 · bflorat