The specification and verification seminar happens on Thursdays at 3.30pm, so far at least once every two weeks. We usually meet at the seminar room, B107.
In this page you will find the abstracts and seminar materials of previous sessions. Anounces of forthcoming seminars are made on this page.
Contact: César Rodríguez.
CARET (A temporal logic of calls and returns) was introduced by Alur et al. This logic allows to write linear temporal logic formulas while taking into account matching of calls and returns. However, CARET model checking for Pushdown Systems (PDSs) was never considered in the literature. Previous works only dealt with the model checking problem for Recursive State Machine (RSMs). While RSMs are a good formalism to model sequential programs written in structured programming languages like C or Java, they become non suitable for modeling binary or assembly programs, since, in these programs, explicit push and pop of the stack can occur. Thus, it is very important to have a CARET model checking algorithm for PDSs. We tackle this problem in this paper. We also consider CARET model checking with regular valuations, where the set of configurations in which an atomic proposition holds is a regular language. We reduce these problems to the emptiness problem of Büchi Pushdown Systems. We implemented our technique in a tool, and we applied it to different case studies. Our results are encouraging. In particular, we were able to apply our tool to detect several malwares.
This is a joint work with Tayssir Touili.
Parameter synthesis for timed systems aims at deriving parameter valuations satisfying a given property. In this paper we target concurrent systems; it is well known that concurrency is a source of state-space explosion, and partial order techniques were defined to cope with this problem. Here we use partial order semantics for parametric time Petri nets as a way to significantly enhance the result of an existing synthesis algorithm. Given a reference parameter valuation, our approach synthesizes other valuations preserving, up to interleaving, the behavior of the reference parameter valuation. We show the applicability of our approach using acyclic asynchronous circuits.
Join work with Thomas Chatain and Étienne André.
Biological Regulatory Networks (BRNs) are usually used in systems biology for modelling, understanding and controlling the dynamics of different biological functions (differentiation, proliferation, proteins synthesis, apoptose) inside cells. Those networks are enhanced with experimental data that are nowadays more available which give an idea on the dynamics of BRNs components. Formal analysis of such models fails in front of the combinatorial explosion of generated behaviours despite the fact that BRNs provide abstract representation of biological systems.
This thesis handles hybrid modelling, the simulation, the formal verification and control of Large Biological Regulatory Networks. This modelling is done thanks to stochastic automata networks, thereafter to Process Hitting by integrating time-series data.
Firstly, this thesis proposes a refining of the dynamics by estimation of stochastic and temporal (delay) parameters from time-series data and integration of those parameters in automata networks models. This integration allows the parametrisation of the transitions between the states of the system. Then, a statistical analysis of the traces of the stochastic simulation is proposed to compare the dynamics of simulations with the experimental data.
Secondly, this thesis develops static analysis by abstract interpretation in the automata networks allowing efficient under- and over-approximation of quantitative (probability and delay) reachability properties. This analysis enables to highlight the critical components to satisfy these properties.
Finally, taking advantage from the previous developed static analyses for the reachability properties in the qualitative point of view, and from the power of logic programming (Answer Set Programming), this thesis addresses the domain of control of system by proposing the identification of bifurcation transitions. Bifurcations are transitions after which the system can no longer reach a state that was previously reachable.
Pushdown systems (PDSs) are a natural model for sequential programs, but they can fail to accurately represent the way an assembly stack actually operates. Indeed, one may want to access the part of the memory that is below the current stack or base pointer, hence the need for a model that keeps track of this part of the memory. To this end, we introduce pushdown systems with an upper stack (UPDSs), an extension of PDSs where symbols popped from the stack are not destroyed but instead remain just above its top, and may be overwritten by later push rules.
We prove that the sets of successors post* and predecessors pre* of a regular set of configurations of such a system are not always regular, but that post* is context-sensitive, so that we can decide whether a single configuration is forward reachable or not. In order to underapproximate pre* in a regular fashion, we consider a bounded-phase analysis of UPDSs, where a phase is a part of a run during which either push or pop rules are forbidden. We then present a method to overapproximate post* that relies on regular abstractions of runs of UPDSs. Finally, we show how these approximations can be used to detect stack overflows and stack pointer manipulations with malicious intent.
This is a joint work with Marcio Diaz And Tayssir Touili.
Malware detection is nowadays a big challenge. The existing techniques for malware detection require a huge effort of engineering to manually extract the malicious behaviors. To avoid this tedious task of manually discovering malicious behaviors, we propose in this paper to apply learning for malware detection.
Given a set of malwares and a set of benign programs, we show how learning techniques can be applied in order to detect malware. For that, we use abstract API graphs to represent programs. Abstract API graphs are graphs whose nodes are API functions and whose edges represent the order of execution of the different calls to the API functions (i.e., functions supported by the operating system). To learn malware, we apply well-known learning techniques based on Random Walk Graph Kernel (combined with Support Vector Machines). We can achieve a high detection rate with only few false alarms (98.93% for detection rate with 1.24% of false alarms). Moreover, we show that our techniques are able to detect several malwares that could not be detected by well-known and widely used antiviruses such as Avira, Kaspersky, Avast, Qihoo-360, McAfee, AVG, BitDefender, ESET-NOD32, F-Secure, Symantec or Panda.
This is a joint work with Tayssir Touili.
The number of unique malware has been doubling every year for over two decades. The majority of effort in malware analysis has focused on methods for preventing malware infection. We view the exponential growth of malware as an underutilized source of intelligence. Given that the number of malware authors are not doubling each year, the large volume of malware must contain evidence that connects them. The challenge is how to extract the connections and do so in a timely manner to be useful.
This talk will describe Cythereal MAGIC, a cloud-based malware analysis service, for mining large scale malware repositories. MAGIC, an offshoot of research conducted under the DARPA Cyber Genome program, uses malware "genome" to construct features that are resilient to many obfuscations. The correlations discovered by MAGIC can be used to prevent and detect zero-day malware. The correlations can also be used to investigate malware attacks for attribution.
Automatically verifying safety properties of programs is hard. Many approaches exist for verifying programs operating on Boolean and integer values (e.g. abstract interpretation, counterexample-guided abstraction refinement using interpolants), but transposing them to array properties has been fraught with difficulties. Our work addresses that issue with a powerful and flexible abstraction that morphes concrete array cells into a finite set of abstract ones. This abstraction is parametric both in precision and in the back-end analysis used.
One possible application would be distributed systems, where processes are modeled using arrays indexed by the process ID.
Nous présentons dans ce travail de thèse des approches pour la spécification et la vérification des systèmes temporisés. La première partie concerne une méthode de spécification en utilisant les diagrammes états-transitions pour modéliser un système donné en partant d'une description textuelle. Elle comporte plusieurs étapes et utilise des observateurs d'états et des événements afin d'engendrer le diagramme états-transitions. Un outil qui implémente les différentes étapes de la méthode de spécification pour une application semi-automatique est présenté. La seconde partie concerne une traduction des diagrammes états-transitions vers les réseaux de Petri colorés, ce qui permet d'utiliser les méthodes de vérification. Nous prenons en considération dans cette traduction un ensemble important des éléments syntaxiques des diagrammes états-transitions, tels que la concurrence, la hiérarchie, etc. Un outil qui implémente la traduction pour un passage automatique des diagrammes états-transitions vers les réseaux de Petri colorés est en cours de développement. La dernière partie concerne l'intégration des contraintes temporelles dans les deux approches précédentes. Nous définissons des annotations pour les diagrammes états-transitions dont nous fournissons la syntaxe et la sémantique. Ces annotations seront ensuite utilisées dans la méthode de spécification et la traduction. Le but est de proposer des annotations faciles à comprendre et à utiliser avec une syntaxe qui prend en compte des contraintes parmi les plus utilisées.
Mots clés : modélisation, diagrammes états-transitions UML, formalisation, sémantique formelle, réseaux de Petri colorés, contraintes temporelles.
Ensuring the correctness of critical real-time systems, involving concurrent behaviors and timing requirements, is crucial. Parameter synthesis aims at computing dense sets of valuations for the timing requirements, guaranteeing a good behavior. However, in most cases, the emptiness problem for reachability (i.e. whether there exists at least one parameter valuation for which some state is reachable) is undecidable and, as a consequence, synthesis procedures do not terminate in general, even for bounded parameters. In this paper, we introduce a parametric extrapolation, that allows us to derive an underapproximation in the form of linear constraints containing all the integer points ensuring reachability or unavoidability, and all the (non-necessarily integer) convex combinations of these integer points, for general PTA with a bounded parameter domain. Our algorithms terminate and can output constraints arbitrarily close to the complete result.
Joint work with Didier Lime and Olivier H. Roux.
The number of new malwares is increasing everyday. Thus malware detection is nowadays a big challenge. The existing techniques for malware detection require a huge effort of engineering to manually extract the malicious behaviors. To avoid this tedious task, we propose in this paper an approach to automatically extract the malicious behaviors. We model a program using an API call graph, and we represent the malicious behaviors using a malicious API graph. We then reduce the malicious behavior extraction problem to the problem of retrieving from the benign and malicious API call graphs the set of subgraphs that are relevant for malicious behaviors. We solve this issue by applying and adapting well-known efficient Information Retrieval techniques based on the TFIDF scheme. We use our automatically extracted malicious behavior specification for malware detection using a kind of product between graphs. We obtained interesting experimental results, as we get 99.04% of detection rate. Moreover, we were able to detect several malwares that well-known and widely used antiviruses such as Panda, Avira, Kaspersky, Avast, Qihoo- 360, McAfee, AVG, BitDefender, ESET-NOD32, F-Secure, and Symantec could not detect.
This is a joint work with Tayssir Touili.
Interval Markov Chains (IMCs) are the base of a classic probabilistic specification theory introduced by Larsen and Jonsson in 1991. They are also a popular abstraction for probabilistic systems. In this paper we study parameter synthesis for a parametric extension of Interval Markov Chains in which the endpoints of intervals may be replaced with parameters. In particular, we propose constructions for the synthesis of all parameter values ensuring several properties such as consistency and consistent reachability in both the existential and universal settings with respect to implementations. We also discuss how our constructions can be modified in order to synthesise all parameter values ensuring other typical properties.
Joint work with Benoît Delahaye and Didier Lime.
Partial order reduction (POR) and net unfoldings are two alternative methods to tackle state-space explosion caused by concurrency. In this paper, we propose the combination of both approaches in an effort to combine their strengths. We first define, for an abstract execution model, unfolding semantics parameterized over an arbitrary independence relation. Based on it, our main contribution is a novel stateless POR algorithm that explores at most one execution per Mazurkiewicz trace, and in general, can explore exponentially fewer, thus achieving a form of emph{super-optimality}. Furthermore, our unfolding-based POR copes with non-terminating executions and incorporates state-caching. On benchmarks with busy-waits, among others, our experiments show a dramatic reduction in the number of executions when compared to a state-of-the-art DPOR.
Joint work with Marcelo Sousa, Subodh Sharma, and Daniel Kroening. Paper available at: http://arxiv.org/abs/1507.00980
Interrupt-driven software is difficult to test and debug, especially when interrupts can be nested and subject to priorities. Interrupts can arrive at arbitrary times, leading to an explosion in the number of cases to be considered. We present a new formal approach to verifying interrupt-driven software based on symbolic execution. The approach leverages recent advances in the encoding of the execution traces of interacting, concurrent threads. We assess the performance of our method on benchmarks drawn from embedded systems code and device drivers, and experimentally compare it to conventional formal approaches that use source-to-source transformations. Our experimental results show that our method significantly outperforms conventional techniques. To the best of our knowledge, our technique is the first to demonstrate effective formal verification of low-level embedded software with nested interrupts.
Joint work with Daniel Kroening, Tom Melham, Peter Schrammel and Michael Tautschnig
Partial orders are a fundamental mathematical structure capable of representing concurrency and causality on a set of atomic events. In many applications it is essential to consider multiple partial orders, each representing a particular behavioral scenario or an operating mode of a modeled system.
In this talk I will present two mathematical formalisms capable of the compressed representation of sets of partial orders: Labeled Event Structures (LESs) and Conditional Partial Order Graphs (CPOGs). I will demonstrate their advantages and disadvantages and propose efficient algorithms for transforming of a set of partial orders from a given compressed representation in one formalism into an equivalent representation in another formalism without the explicit enumeration of each scenario. These algorithms make use of an intermediate mathematical formalism, which we call Conditional Labeled Event Structures (CLESs), that combines the advantages of LESs and CPOGs.
This is joint work with Andrey Mokhov (Newcastle University)
UML is the de facto standard for modelling concurrent systems in the industry. Activity diagrams allow designers to model workflows or business processes. Unfortunately, their informal semantics prevents the use of automated verification techniques. In this paper, we first propose activity diagram patterns for modelling timed concurrent systems; we then devise a modular mechanism to compose timed diagram fragments into a UML activity diagram that also allows for refinement, and we formalise the semantics of our patterns using time Petri nets. Our approach guides the modeller task% (helping to avoid common mistakes), and allows for automated verification.
Joint work with Christine Choppy and Thierry Noulamo
Qualitative formal verification aims at checking that a given formal property holds on a model, given as an automaton. The automata-based approach expresses the property as an automaton then analyses its synchronisation with the model automaton. This method can be extended with various automata flavors to handle quantitative properties.
We focus on an extension of Linear Temporal Logic (LTL) with counting capabilities: Cost Linear Temporal Logic (CLTL). This logics can be translated into Büchi automata equipped with counters, so as to nicely extend the automata approach to model-checking. We describe the new properties that this extension allows to handle, illustrated with examples. We also explain how it is linked to existing qualitative LTL model-checking, and the new challenges it poses. We also propose a CEGAR-like approach to answer compute the bounds of the values reached by the automata.
Nowadays, students are more and more demanding for practical coursework, which is a challenge when teaching formal approaches to software engineering. The solution is to provide environments for such hands-on sessions and homework, but this raises numerous difficulties. The environment must be: (i) multi-platform (Mac OS, Linux, Windows) so as to enable student practice at home, (ii) easy to deploy, (iii) easy to use and to take charge of, and (iv) flexible enough to enable the integration of new notations and associated services.
CosyVerif is a software environment dedicated to graphical notations, that provides the mechanisms and means for an easy integration of additional existing software for teaching (or demonstration) purposes. This makes it an interesting platform to establish new courses.
This paper presents our experience using CosyVerif for teaching Petri nets and parametric timed automata in two universities of the Paris region, i.e. Université Pierre et Marie Curie, and Université Paris 13. We also use CosyVerif to build demonstrators of Ph.D. students' work.
Over the past two decades, numerous verification tools have been successfully used for verifying complex concurrent systems, modelled using various formalisms. However, it is still hard to coordinate these tools since they rely on such a large number of formalisms. Having a proper syntactical mechanism to interrelate them through variability would increase the capability of effective integrated formal methods. Here, we propose a modular approach for defining new formalisms by reusing existing ones and adding new features and/or constraints. Our approach relies on standard XML technologies; their use provides the capability of rapidly and automatically obtaining tools for representing and validating models. It thus enables fast iterations in developing and testing complex formalisms. As a case study, we applied our modular definition approach on families of Petri nets and timed automata. This work is implemented in the CosyVerif platform, a modular framework integrating verification software tools of the Paris region.
This is a joint work with the CosyVerif team.
Quantifying the robustness of a real-time system consists in measuring the maximum extension of the timing delays such that the system still satisfies its specification. In this work, we introduce a more precise notion of robustness, measuring the allowed variability of the timing delays in their neighbourhood. We consider here the formalism of time Petri nets extended with inhibitor arcs. We use the inverse method, initially defined for timed automata. Its output, in the form of a parametric linear constraint relating all timing delays, allows the designer to characterise the system local robustness, and hence to identify the delays allowing the least variability. We also exhibit a condition and a construction for rendering robust a non-robust system.
This work is a joint work with Laure Petrucci.
In model checking, a well known source of state-space explosion (SSE) is the explicit representation of concurrent actions by their interleavings. Partial-order reductions (PORs) are a family of techniques attempting to cope with this by constructing an equivalent state space where irrelevant executions of the original are discarded.
This talk will be a gentle introduction to the topic. We will focus on Godefroid's persistent sets, prove that a selective exploration based on them visits all deadlocks, discuss the two main classes of algorithms for computing them, and finish, time permitting, with an overview of the conceptual similarities and differences between PORs and the unfolding technique.
In model checking, a well known source of state-space explosion (SSE) is the explicit representation of concurrent actions by their interleavings. Partial-order reductions attempt to address this by constructing an equivalent state space where irrelevant executions of the original are discarded. A comparatively less well-known approach emerges from the use of partial-order semantics, where the state space is instead represented by a partial order where concurrent actions are simply left unordered. Petri net unfoldings are arguably the most prominent verification technique issued from this idea.
In this talk, three different semantics for Petri nets will be presented, the last one of which will be the aforementioned unfolding semantics. We will then see how unfoldings can be employed in practical verification and, time permitting, how to improve the method to address additional sources of SSE.
Last modified on Fri 28 Jan 23:48:57 GMT 2022