The website of the Theory of Complex Systems group has moved.
Please use the following link if you are not automatically redirected:

http://www.tks.cs.uni-frankfurt.de/schweika

Theorie komplexer Systeme
Prof. Dr. Nicole Schweikardt
Institut für Informatik
Goethe-Universität
Frankfurt am Main

Abstracts of Nicole Schweikardt's Publications

Contributions to Books   -   Journal Papers   -   Conference Papers   -   Miscellaneous   -    Theses

Contributions to Books   (in chronological order) :

[B1]
Die Ausdrucksstärke der Logik erster Stufe mit eingebauten Prädikaten. Nicole Schweikardt. In Ausgezeichnete Informatikdissertationen 2002, Dorothea Wagner et al., editors, volume D-3 of GI-Edition Lecture Notes in Informatics (LNI) - Dissertations, pages 153-162, Gesellschaft für Informatik, 2003. ISBN 3-88579-407-1.
Download: pdf (132 kB).

Abstract (in german):
Meine Dissertation beschäftigt sich mit der Ausdrucksstärke der Logik erster Stufe auf Strukturen mit eingebauten Prädikaten wie z.B. lineare Ordnung, Addition und Multiplikation. Die Hauptergebnisse lassen sich den drei Teilbereichen "Arithmetik und Zählquantoren", "die Crane Beach-Vermutung" und "Kollaps-Resultate in der Datenbanktheorie" zuordnen. Ziel des hier vorliegenden Artikels ist, einen Einblick in die Fragestellungen und Ergebnisse zu diesen drei Themenkreisen zu geben.


[B2]
Expressive power of monadic logics on words, trees, pictures, and graphs. Oliver Matz and Nicole Schweikardt. Logic and Automata: History and Perspectives. J. Flum, E. Grädel, and T. Wilke (editors). Texts in Logic and Games, Amsterdam University Press, pages 531-552, 2007.
Download: pdf (211 kB)

Abstract:
We give a survey of the expressive power of various monadic logics on specific classes of finite labeled graphs, including words, trees, and pictures. Among the logics we consider, there are monadic second-order logic and its existential fragment, the modal mu-calculus, and monadic least fixed-point logic. We focus on nesting-depth and quantifier alternation as a complexity measure of these logics.


[B3]
Short-Entries on Ehrenfeucht-Fraïssé Games, One-Pass Algorithms, and Zero-One Laws. Nicole Schweikardt. In Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (editors), Springer-Verlag, 2009, pages 963-964, 1948-1949, and 3683.
Download: Ehrenfeucht-Fraïssé Games (pdf) (50 kB), One-Pass Algorithms (pdf) (42 kB), Zero-One Laws (pdf) (45 kB).

Abstract:
These short-entries give brief descriptions of the notions of Ehrenfeucht-Fraïssé Games, and One-Pass Algorithms, Zero-One Laws.


[B4]
Database Theory: Query Languages. Nicole Schweikardt, Thomas Schwentick, and Luc Segoufin. Chapter 19 in Algorithms and Theory of Computation Handbook, Second Edition, Volume 2: Special Topics and Techniques, Mikhail J. Atallah and Marina Blanton (editors), CRC Press, to appear in Nov. 2009.

Abstract:
This chapter gives an introduction to the theoretical foundations of query languages for relational databases. It thus addresses a significant part of database theory. Special emphasis is put on the expressive power of query languages and the computational complexity of their associated evaluation and static analysis problems.


Journal Papers   (in chronological order) :

[J1]
The monadic quantifier alternation hierarchy over grids and graphs. Oliver Matz, Nicole Schweikardt, and Wolfgang Thomas. Information and Computation, volume 179, number 2, 2002, pages 356-383. Special issue for selected papers from LICS'97.
Download: ps.gz (208 kB), ps (912 kB), pdf (553 kB).

Abstract:
The monadic second-order quantifier alternation hierarchy over the class of finite graphs is shown to be strict. The proof is based on automata theoretic ideas and starts from a restricted class of graph-like structures, namely finite two-dimensional grids. Considering grids where the width is a function of the height, we prove that the difference between the levels k and k+1 of the monadic hierarchy is witnessed by a set of grids where this function is (k+1)-fold exponential. We then transfer the hierarchy result to the class of directed (or undirected) graphs, using an encoding technique called "strong reduction".
It is notable that one can obtain sets of graphs which occur arbitrarily high in the monadic hierarchy but are already definable in the first-order closure of existential monadic second-order logic. We also verify that these graph properties even belong to the complexity class NLOGSPACE, which indicates a profound difference between the monadic hierarchy and the polynomial time hierarchy.


[J2]
Comparing the succinctness of monadic query languages over finite trees. Martin Grohe and Nicole Schweikardt. RAIRO - Theoretical Informatics and Applications (ITA), volume 38, pages 343-373, 2004.
Download: ps.gz (186 kB), ps (461 kB), pdf (262 kB).

Abstract:
This is the journal version of [C6].
We study the succinctness of monadic second-order logic and a variety of monadic fixed point logics on trees. All these languages are known to have the same expressive power on trees, but some can express the same queries much more succinctly than others. For example, we show that, under some complexity theoretic assumption, monadic second-order logic is non-elementarily more succinct than monadic least fixed point logic, which in turn is non-elementarily more succinct than monadic datalog.
Succinctness of the languages is closely related to the combined and parameterized complexity of query evaluation for these languages.


[J3]
Logik und Informatik. Stephan Kreutzer und Nicole Schweikardt. it - Information Technology, Vol. 46, No. 3, 2004, pages 162-166.
Download: pdf (159 kB).

Abstract (in german):
Als eine ihrer wichtigsten theoretischen Grundlagen hat die Logik Bedeutung für fast alle Bereiche der Informatik. Wir wollen diese hier anhand der Bereiche Datenbanken, automatische Verifikation und Komplexitätstheorie erläutern und dabei die Fragestellungen und Ergebnisse unserer Dissertationen vorstellen.


[J4]
Arithmetic, First-Order Logic, and Counting Quantifiers. Nicole Schweikardt. ACM Transactions on Computational Logic (ACM ToCL), volume 6, issue 3, July 2005, pages 634-671.
Download: ps.gz (263 kB), ps (698 kB), pdf (384 kB).

Abstract:
This paper gives a thorough overview of what is known about first-order logic with counting quantifiers and with arithmetic predicates. As a main theorem we show that Presburger arithmetic is closed under unary counting quantifiers. Precisely, this means that for every first-order formula phi(y,z_1,...,z_k) over the signature {<,+} there is a first-order formula psi(x,z_1,...,z_k) which expresses over the structure (Nat,<,+) (respectively, over initial segments of this structure) that the variable x is interpreted exactly by the number of possible interpretations of the variable y for which the formula phi(y,z_1,...,z_k) is satisfied. Applying this theorem, we obtain an easy proof of Ruhl's result that reachability (and similarly, connectivity) in finite graphs is not expressible in first-order logic with unary counting quantifiers and addition. Furthermore, the above result on Presburger arithmetic helps to show the failure of a particular version of the Crane Beach conjecture.


[J5]
First-Order Expressibility of Languages with Neutral Letters Or: The Crane Beach Conjecture. David A. Mix Barrington, Neil Immerman, Clemens Lautemann, Nicole Schweikardt and Denis Thérien. Journal of Computer and System Sciences, volume 70, pages 101-127, 2005.
Download: ps.gz (260 kB), ps (626 kB), pdf (298 kB).

Abstract:
This is the journal version of [C4].
A language L over an alphabet A is said to have a neutral letter if there is a letter e in A such that inserting or deleting e's from any word in A* does not change its membership or non-membership in L.
The presence of a neutral letter affects the definability of a language in first-order logic. It was conjectured that it renders all numerical predicates apart from the order predicate useless, i.e., that if a language L with a neutral letter is not definable in first-order logic with linear order, then it is not definable in first-order logic with any set N of numerical predicates. Named after the location of its first, flawed, proof this conjecture is called the Crane Beach conjecture (CBC, for short). The CBC is closely related to uniformity conditions in circuit complexity theory and to collapse results in database theory.
We investigate the CBC in detail, showing that it fails for N = {+,*}, or, possibly stronger, for any set N that allows counting up to the m times iterated logarithm, for any constant m. On the positive side, we prove the conjecture for the case of all monadic numerical predicates, for the addition predicate +, for the fragment BC(Sigma_1) of first-order logic, for regular languages, and for languages over a binary alphabet. We explain the precise relation between the CBC and so-called natural-generic collapse results in database theory. Furthermore, we introduce a framework that gives better understanding of what exactly may cause a failure of the conjecture.


[J6]
The succinctness of first-order logic on linear orders. Martin Grohe and Nicole Schweikardt. Logical Methods in Computer Science, volume 1 (1:6) 2005, pages 1-25.
Download here from http://www.lmcs-online.org/.

Abstract:
This is the journal version of [C7].
Succinctness is a natural measure for comparing the strength of different logics. Intuitively, a logic L is more succinct than another logic L' if all properties that can be expressed in L' can be expressed in L by formulas of (approximately) the same size, but some properties can be expressed in L by (significantly) smaller formulas. We study the succinctness of logics on linear orders that have the same expressive power as first-order logic. Our first theorem is concerned with the finite variable fragments of first-order logic. We prove that:
  1. Up to a polynomial factor, the 2- and the 3-variable fragments of first-order logic on linear orders have the same succinctness.
  2. The 4-variable fragment is exponentially more succinct than the 3-variable fragment.
Our second main result compares the succinctness of first-order logic on linear orders with that of monadic second-order logic. We prove that the fragment of monadic second-order logic that has the same expressiveness as first-order logic on linear orders is non-elementarily more succinct than first-order logic.


[J7]
On the expressive power of monadic least fixed point logic. Nicole Schweikardt. Theoretical Computer Science, volume 350, pages 325-344, 2006. Special issue for selected papers from ICALP'04, Track B.
Download: ps.gz (164 kB), ps (404 kB), pdf (242 kB).

Abstract:
This is the journal version of [C8].
Monadic least fixed point logic MLFP is a natural logic whose expressiveness lies between that of first-order logic FO and monadic second-order logic MSO. In this paper we take a closer look at the expressive power of MLFP. Our results are:
  1. MLFP can describe graph properties beyond any fixed level of the monadic second-order quantifier alternation hierarchy.
  2. On strings with built-in addition, MLFP can describe at least all languages that belong to the linear time complexity class DLIN.
  3. Settling the question whether addition-invariant MLFP = addition-invariant MSO on finite strings would solve open problems in complexity theory: "=" would imply that PH = PTIME whereas "≠" would imply that DLIN ≠ LinH.


[J8]
An Ehrenfeucht-Fraïssé Game Approach to Collapse Results in Database Theory. Nicole Schweikardt. Information and Computation, volume 205, issue 3, pages 311-379, 2007.
Download: ps.gz (379 kB), ps (1107 kB), pdf (697 kB).

Abstract:
Pursuing an Ehrenfeucht-Fraïssé game approach to collapse results in database theory, we show that, in principle, every natural generic collapse result may be proved via a translation of winning strategies for the duplicator in an Ehrenfeucht-Fraïssé game. Following this approach we can deal with certain infinite databases where previous, highly involved methods fail. We prove the natural generic collapse for Z-embeddable databases over any linearly ordered context structure with arbitrary monadic predicates, and for N-embeddable databases over the context structure (R,<,+,Mon_Q,Groups). Here, N, Z, and R, denote the sets of natural numbers, integers, and real numbers, respectively; Groups is the collection of all subgroups of (R,+) that contain Z, and Mon_Q is the collection of all subsets of a particular infinite subset Q of N. This, in particular, implies the collapse for arbitrary databases over (N,<,+,Mon_Q) and for N-embeddable databases over (R,<,+,Z,Q).
Restricting the complexity of the formulas that may be used to formulate queries to Boolean combinations of purely existential first-order formulas, we even obtain the collapse for N-embeddable databases over any linearly ordered context structure with arbitrary predicates.
Finally, we develop the notion of N-representable databases, which is a natural generalization of the classical notion of finitely representable databases. We show that natural generic collapse results for N-embeddable databases can be lifted to the larger class of N-representable databases.
To obtain, in particular, the collapse result for (N,<,+,Mon_Q), we explicitly construct a winning strategy for the duplicator in the presence of the built-in addition relation +. This, as a side product, also leads to an Ehrenfeucht-Fraïssé game proof of the theorem of Ginsburg and Spanier, stating that the spectra of FO(<,+)-sentences are semi-linear.


[J9]
Tight Lower Bounds for Query Processing on Streaming and External Memory Data. Martin Grohe, Christoph Koch, and Nicole Schweikardt. Theoretical Computer Science, volume 380(1-2), pages 199-217, 2007. Special issue for selected papers from ICALP'05.
Download: pdf (314 kB).

Abstract:
This is the journal version of [C11].
It is generally assumed that databases have to reside in external, inexpensive storage because of their sheer size. Current technology for external storage systems presents us with a reality that performance-wise, a small number of sequential scans of the data is strictly preferable over random data accesses. Database technology — in particular query processing technology — has developed around a notion of memory hierarchies with layers of greatly varying sizes and access times. It seems that the current technologies scale up to their tasks and are very successful, but on closer investigation it may appear that our theoretical understanding of the problems involved — and of optimal algorithms for these problems — is not quite as developed.
Recently, data stream processing has become an object of study by the database management community, but from the viewpoint of database theory, this is really a special case of the query processing problem on data in external storage where we are limited to a single scan of the input data.
In the present paper we study a clean machine model for external memory and stream processing. We establish tight bounds for the data complexity of Core XPath evaluation and filtering. We show that the number of scans of the external data induces a strict hierarchy (as long as internal memory space is sufficiently small, e.g., polylogarithmic in the size of the input). We also show that neither joins nor sorting are feasible if the product of the number r(n) of scans of the external memory and the size s(n) of the internal memory buffers is sufficiently small, i.e., of size o(n).


[J10]
Reversal Complexity Revisited. André Hernich and Nicole Schweikardt. Theoretical Computer Science, volume 401(1-3), pages 191-205, 2008.
Abstract: here.   Download: pdf (295 kB).

Abstract:
We study a generalized version of reversal bounded Turing machines where, apart from several tapes on which the number of head reversals is bounded by r(n), there are several further tapes on which head reversals remain unrestricted, but size is bounded by s(n) (where n denotes the input length). Recently, such machines were introduced as a formalization of a computation model that restricts random access to external memory and internal memory space. Here, each of the tapes with a restriction on the head reversals corresponds to an external memory device, and the tapes of restricted size model internal memory. We use ST(r(n),s(n),O(1)) to denote the class of all problems that can be solved by deterministic Turing machines that comply to the above resource bounds. Similarly, NST(...) and RST(...), respectively, are used for the corresponding nondeterministic and randomized classes.
While previous papers focused on lower bounds for particular problems, including sorting, the set equality problem, and several query evaluation problems, the present paper addresses the relations between the (R,N)ST(...)-classes and classical complexity classes and investigates the structural complexity of the (R,N)ST(...)-classes. Our main results are
  1. a trade-off between internal memory space and external memory head reversals,
  2. correspondences between the (R,N)ST(...) classes and "classical" time-bounded, space-bounded, reversal-bounded, and circuit complexity classes, and
  3. hierarchies of (R)ST(...)-classes in terms of increasing numbers of head reversals on external memory tapes.


[J11]
Database Query Processing Using Finite Cursor Machines. Martin Grohe, Yuri Gurevich, Dirk Leinders, Nicole Schweikardt, Jerzy Tyszkiewicz, Jan Van den Bussche. Theory of Computing Systems, volume 44(4), pages 533-560, 2009. Special issue for selected papers from ICDT'07.
Download: pdf (315 kB, preliminary version) or here (link to the journal).

Abstract:
This is the journal version of [C17].
We introduce a new abstract model of database query processing, finite cursor machines, that incorporates certain data streaming aspects. The model describes quite faithfully what happens in so-called "one-pass" and "two-pass query processing". Technically, the model is described in the framework of abstract state machines. Our main results are upper and lower bounds for processing relational algebra queries in this model, specifically, queries of the semijoin fragment of the relational algebra.


[J12]
Lower bounds for processing data with few random accesses to external memory. Martin Grohe, André Hernich, Nicole Schweikardt. Journal of the ACM, volume 56(3), article 12, pages 1-58, 2009. Special issue for selected papers from PODS'06.
Download: pdf (852 kB).

Abstract:
This is the combined journal version of [C12] and [C15].
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive.
We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but admit arbitrary sequential access. A distinguishing feature of our model is that it allows the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel.
In this model, we prove lower bounds for the problem of sorting a sequence of strings (or numbers), the problem of deciding whether two given sets of strings are equal, and two closely related decision problems. Intuitively, our results say that there is no algorithm for the problems that uses internal memory space bounded by N^{1-epsilon} and at most o(log N) random accesses to external memory, but unlimited ``streaming access'', both for writing to and reading from external memory. (Here N denotes the size of the input and epsilon is an arbitrary constant greater than 0.) We even permit randomized algorithms with one-sided bounded error. We also consider the problem of evaluating database queries and prove similar lower bounds for evaluating relational algebra queries against relational databases and XQuery and XPath queries against XML-databases.


[J13]
Logic and Data Exchange: Which Solutions are "Good" Solutions?. André Hernich and Nicole Schweikardt. Logic and the Foundations of Game and Decision Theory (LOFT 8). G. Bonanno, B. Löwe, and W. van der Hoek (editors). Texts in Logic and Games, Amsterdam University Press. Special issue for selected papers from LOFT'08. To appear.
Download: pdf (390 kB)

Abstract:
This paper gives an introduction into the area of data exchange, with a special emphasis on the question of which solutions can be considered "good" solutions. We will concentrate on notions of "good" solutions for query answering, in particular, universal solutions, the core of the universal solutions, and CWA-solutions.


[J14]
Machine Models for Query Processing. Nicole Schweikardt. SIGMOD Record, volume 38(2), June 2009.
Download: pdf (237 kB)

Abstract:
In recent years, a number of machine models have been developed that take into account the existence of multiple storage media of varying sizes and access characteristics. These models are particularly useful for studying the complexity of query evaluation on massive data sets. The present paper gives an overview of such machine models. The models considered in this survey are the external memory model, the read/write streams (a model for sequential external memory processing), the finite cursor machines (a model for relational database query processing), the mpms-automata (a model for processing indexed XML files), the data stream model (a model for processing data on-the-fly), and the mud model (a model for massive, unordered, distributed computations).


Conference Papers   (in chronological order) :

[C1]
The monadic quantifier alternation hierarchy over grids and pictures. Nicole Schweikardt. CSL'97: 6th Annual Conference of the European Association for Computer Science Logic, volume 1414 of Lecture Notes in Computer Science, pages 441-460, Aarhus, Denmark, August 1997. Springer. Selected Papers.
Download: ps.gz (155 kB), ps (577 kB), pdf (288 kB).
Full version: Preliminary Proceedings of CSL'97, BRICS Notes Series NS-97-1, ISSN 0909-3206, pages 383-398.
Download: ps.gz (130 kB), ps (603 kB), pdf (308 kB).

Abstract:
The subject of this paper is the expressive power of monadic second-order logic over two-dimensional grids. We show the strictness of the monadic quantifier alternation hierarchy over the class of two-dimensional (uncolored) grids. Furthermore, we give a new, self-contained game-theoretical proof of the nonexpressibility results of Matz and Thomas. Additionally, we present an alternate and easy proof of a result of Ajtai, Fagin, and Stockmeyer, stating that monadic second-order logic allows to define complete problems of each level of the polynomial time hierarchy.


[C2]
A logical characterisation of linear time on nondeterministic Turing machines. Clemens Lautemann, Nicole Schweikardt, and Thomas Schwentick. STACS'99: 16th Annual Symposium on Theoretical Aspects of Computer Science, volume 1563 of Lecture Notes in Computer Science, pages 143-152, Trier, Germany, March 1999. Springer.
Download: ps.gz (98 kB), ps (437 kB), pdf (210 kB).
Full version: ps.gz (126 kB), ps (538 kB), pdf (296 kB).

Abstract:
The paper gives a logical characterisation of the class NTIME(n) of problems that can be solved on a nondeterministic Turing machine in linear time. It is shown that a set L of strings belongs to this class if and only if there is a formula of the form $\exists f_1 ... \exists f_k \exists R_1 ... \exists R_m \forall x \phi$ that is true exactly for all strings in L. In this formula the f_j are unary function symbols, the R_j are unary relation symbols, and \phi is a quantifier-free formula. Furthermore, the quantification of functions is restricted to non-crossing, decreasing functions, and in \phi no equations in which different functions occur are allowed. There are a number of variations of this statement, e.g., it holds also for k=3. From these results we also derive an Ehrenfeucht-Fraisse game characterisation of NTIME(n).


[C3]
An Ehrenfeucht-Fraïssé approach to collapse results for first-order queries over embedded databases. Clemens Lautemann and Nicole Schweikardt. STACS'01: 18th Annual Symposium on Theoretical Aspects of Computer Science, volume 2010 of Lecture Notes in Computer Science, pages 455-466, Dresden, Germany, February 2001. Springer.
Download: ps.gz (86 kB), ps (225 kB), pdf (208 kB). 
Full version: Technical Report 01/2001, Institut für Informatik, Johannes Gutenberg-Universität Mainz, Germany, 2003.
Download: ps.gz (126 kB), ps (538 kB), pdf (296 kB).

Abstract:
We present a new proof technique for collapse results for first-order queries on databases which are embedded in the natural numbers N or in the positive real numbers Rpos. Our proofs are by means of an explicitly constructed winning strategy for the duplicator, one of the two players in an Ehrenfeucht-Fraisse game. This game-theoretic approach allows us to deal with certain infinite databases where previous, highly involved methods fail. Our main result is that first-order logic has the natural-generic collapse over the structure (N,<,+) for arbitrary (i.e., possibly infinite) databases. Furthermore, a first application of this result shows the natural-generic collapse of first-order logic over the structure (Rpos,<,+) for a certain kind of databases over Rpos which consist of a possibly infinite number of regions.


[C4]
The Crane Beach Conjecture. David A. Mix Barrington, Neil Immerman, Clemens Lautemann, Nicole Schweikardt and Denis Thérien. LICS'01: 16th Annual IEEE Symposium on Logic in Computer Science, pages 187-196, Boston, Massachusetts, USA, June 2001. IEEE Computer Society.
Download: ps.gz (126 kB), ps (538 kB), pdf (296 kB).

Abstract:
A language L over an alphabet A is said to have a neutral letter if there is a letter e in A such that inserting or deleting e's from any word in A* does not change its membership (or non-membership) in L. The presence of a neutral letter affects the definability of a language in first-order logic. It was conjectured that it renders all numerical predicates apart from the order predicate useless, i.e., that if a language L with a neutral letter is not definable in first-order logic with linear order, then it is not definable in first-order logic with any set N of numerical predicates.
We investigate this conjecture in detail, showing that it fails already for N = {+,*}, or, possibly stronger, for any set N that allows counting up to the m times iterated logarithm, for any constant m. On the positive side, we prove the conjecture for the case of all monadic numerical predicates, for N = {+}, for the fragment BC(\Sigma_1) of first-order logic, and for binary alphabets.


[C5]
The natural order-generic collapse for omega-representable databases over the rational and the real ordered group. Nicole Schweikardt. CSL'01: 10th Annual Conference of the European Association for Computer Science Logic, volume 2142 of Lecture Notes in Computer Science, pages 130-144, Paris, France, September 2001. Springer.
Download: ps.gz (104 kB), ps (288 kB), pdf (235 kB).
Full version: Technical Report 02/2001, Institut für Informatik, Johannes Gutenberg-Universität Mainz, Germany, 2003.
Download: ps.gz (112 kB), ps (314 kB), pdf (278 kB).

Abstract:
We consider order-generic queries, i.e., queries that commute with every order-preserving automorphism of a structure's universe. It is well-known that first-order logic has the natural order-generic collapse over the rational and the real ordered group for the class of dense order constraint databases (also known as finitely representable databases). I.e., on this class of databases over (Q,<) or (R,<), addition does not add to the expressive power of first-order logic for defining order-generic queries. In the present paper we develop a natural generalization of the notion of finitely representable databases, where an arbitrary (i.e., possibly infinite) number of regions is allowed. We call these databases omega-representable, and we prove the natural order-generic collapse over the rational and the real ordered group for this larger class of databases.


[C6]
Comparing the succinctness of monadic query languages over finite trees. Martin Grohe and Nicole Schweikardt. CSL'03: 12th Annual Conference of the European Association for Computer Science Logic, volume 2803 of Lecture Notes in Computer Science, pages 226-240, Vienna, Austria, August 2003. Springer.
Download: ps.gz (63 kB), ps (161 kB), pdf (177 kB).
Full version: Technical Report EDI-INF-RR-0168, School of Informatics, University of Edinburgh, Scotland, U.K., 2003.
Download: ps.gz (745 kB), ps (1745 kB), pdf (345 kB).

Abstract:
We study the succinctness of monadic second-order logic and a variety of monadic fixed point logics on trees. All these languages are known to have the same expressive power on trees, but some can express the same queries much more succinctly than others. For example, we show that, under some complexity theoretic assumption, monadic second-order logic is non-elementarily more succinct than monadic least fixed point logic, which in turn is non-elementarily more succinct than monadic datalog.
Succinctness of the languages is closely related to the combined and parameterized complexity of query evaluation for these languages.


[C7]
The succinctness of first-order logic on linear orders. Martin Grohe and Nicole Schweikardt. LICS'04: 19th Annual IEEE Symposium on Logic in Computer Science, pages 438-447, Turku, Finland, July 2004. IEEE Computer Society.
Download: ps.gz (120 kB), ps (162 kB), pdf (177 kB).
Full version: ps.gz (147 kB), ps (380 kB), pdf (201 kB).

Abstract:
Succinctness is a natural measure for comparing the strength of different logics. Intuitively, a logic L is more succinct than another logic L' if all properties that can be expressed in L' can be expressed in L by formulas of (approximately) the same size, but some properties can be expressed in L by (significantly) smaller formulas. We study the succinctness of logics on linear orders that have the same expressive power as first-order logic. Our first theorem is concerned with the finite variable fragments of first-order logic. We prove that:
  1. Up to a polynomial factor, the 2- and the 3-variable fragments of first-order logic on linear orders have the same succinctness.
  2. The 4-variable fragment is exponentially more succinct than the 3-variable fragment.
Our second main result compares the succinctness of first-order logic on linear orders with that of monadic second-order logic. We prove that the fragment of monadic second-order logic that has the same expressiveness as first-order logic on linear orders is non-elementarily more succinct than first-order logic.


[C8]
On the expressive power of monadic least fixed point logic. Nicole Schweikardt. ICALP'04: 31st International Colloquium on Automata, Languages and Programming, volume 3142 of Lecture Notes in Computer Science, pages 1123-1135, Turku, Finland, July 2004. Springer-Verlag.
Download: ps.gz (119 kB), ps (284 kB), pdf (144 kB).
Full version: ps.gz (177 kB), ps (433 kB), pdf (237 kB).

Abstract:
Monadic least fixed point logic MLFP is a natural logic whose expressiveness lies between that of first-order logic FO and monadic second-order logic MSO. In this paper we take a closer look at the expressive power of MLFP. Our results are:
  1. MLFP can describe graph properties beyond any fixed level of the monadic second-order quantifier alternation hierarchy.
  2. On strings with built-in addition, MLFP can describe at least all languages that belong to the linear time complexity class DLIN.
  3. Settling the question whether addition-invariant MLFP = addition-invariant MSO on finite strings would solve open problems in complexity theory: "=" would imply that PH = PTIME whereas "≠" would imply that DLIN ≠ LinH.


[C9]
Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams. Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, and Bernhard Stegmaier. VLDB'04: 30th International Conference on Very Large Data Bases, pages 228-239, Toronto, Canada, August/September 2004. Morgan Kaufmann.
Download: pdf (220 kB).
Full version: CoRR Report, arXiv:cs.DB/0406016, 07 June 2004, 14 pages.   Download: ps (281 kB)

Abstract:
We introduce an extension of the XQuery language, FluX, that supports event-based query processing and the conscious handling of main memory buffers. Purely event-based queries of this language can be executed on streaming XML data in a very direct way. We then develop an algorithm that allows to efficiently rewrite XQueries into the event-based FluX language. This algorithm uses order constraints from a DTD to schedule event handlers and to thus minimize the amount of buffering required for evaluating a query. We discuss the various technical aspects of query optimization and query evaluation within our framework. This is complemented with an experimental evaluation of our approach (see also the demo paper [C10]).


[C10]
FluXQuery: An Optimizing XQuery Processor for Streaming XML Data. Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, and Bernhard Stegmaier. Demo Paper. VLDB'04: 30th International Conference on Very Large Data Bases, pages 1309-1312, Toronto, Canada, August/September 2004. Morgan Kaufmann.
Download: pdf (101 kB).

Abstract:
In this demonstration, we present the FluXQuery engine for XML streams. Optimization in FluXQuery is based on a new internal query language called FluX and presented in the conference contribution [C9]. The FluX language slightly extends the main structures of XQuery by a construct for event-based query processing. By allowing for the conscious use of main memeory buffers, it supports reasoning over the employment of buffers during query evaluation.


[C11]
Tight Lower Bounds for Query Processing on Streaming and External Memory Data. Martin Grohe, Christoph Koch, and Nicole Schweikardt. ICALP'05: 32nd International Colloquium on Automata, Languages and Programming, volume 3580 of Lecture Notes in Computer Science, pages 1076-1088. Lisbon, Portugal, July 2005. Springer-Verlag.
Download: pdf (182 kB).   Received the Best Paper Award at ICALP'05, Track B.
Full version: CoRR Report, arXiv:cs.DB/0505002, 29 April 2005, 25 pages.   Download: pdf (303 kB).

Abstract:
We study a clean machine model for external memory and stream processing. This model employs a Turing machine with an input tape (corresponding to an external memory device such as, e.g., a hard disk) and a number of work tapes (corresponding to the main memory buffers of a database management system, and therefore usually small compared to the input).
We show that the number of scans of the external data induces a strict hierarchy (as long as work space is sufficiently small, e.g., polylogarithmic in the size of the input). We also show that neither joins nor sorting are feasible if the product of the number r(n) of scans of the external memory and the size s(n) of the internal memory buffers is sufficiently small, e.g., of size o(\sqrt[5]{n}). We also establish tight bounds for the complexity of XPath evaluation and filtering.


[C12]
Lower Bounds for Sorting with Few Random Accesses to External Memory. Martin Grohe and Nicole Schweikardt. PODS'05: 24th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 238-249, Baltimore, USA, June 2005. ACM.
Download: ps.gz (104 kB), ps (260 kB), pdf (164 kB).

Abstract:
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive.
We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but do not restrict sequential reads. A distinguishing feature of our model is that it admits the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel. In practice, such auxiliary external memory can be crucial. For example, in a first sequential pass the data can be annotated, and in a second pass this annotation can be used to answer the query. Koch's (VLDB'03) ARB system for answering XPath queries is based on such a strategy.
In this model, we prove lower bounds for sorting the input data. As opposed to related results for models without auxiliary external memory for intermediate results, we cannot rely on communication complexity to establish these lower bounds. Instead, we simulate our model by a non-uniform computation model for which we can establish the lower bounds by combinatorial means.


[C13]
The Expressive Power of Two-Variable Least Fixed-Point Logics Martin Grohe, Stephan Kreutzer, and Nicole Schweikardt. MFCS'05: 30th International Symposium on Mathematical Foundations of Computer Science, volume 3618 of Lecture Notes in Computer Science, pages 422-434 Gdansk, Poland, August/September 2005. Springer-Verlag.
Download: ps.gz (1192 kB), ps (449 kB), pdf (183 kB).

Abstract:
This paper give a classification of the expressive power of two-variable least fixed-point logics. The main results are:
  1. The two-variable fragment of monadic least fixed-point logic with parameters is as expressive as full monadic least fixed-point logic (on binary structures).
  2. The two-variable fragment of monadic least fixed-point logic without parameters is as expressive as the two-variable fragment of binary least fixed-point logic without parameters.
  3. The two-variable fragment of binary least fixed-point logic with parameters is strictly more expressive than the two-variable fragment of monadic least fixed-point logic with parameters (even on finite strings).


[C14]
The Complexity of Querying External Memory and Streaming Data. Martin Grohe, Christoph Koch, and Nicole Schweikardt. FCT'05: 15th International Symposium on Fundamentals of Computation Theory, volume 3623 of Lecture Notes in Computer Science, pages 1-16. Lübeck, Germany, August 2005. Springer-Verlag.
Download: ps.gz (192 kB), ps (456 kB), pdf (213 kB).

Abstract:
We review a recently introduced computation model for streaming and external memory data. An important feature of this model is that it distinguishes between sequentially reading (streaming) data from external memory (through main memory) and randomly accessing external memory data at specific memory locations; it is well-known that the latter is much more expensive in practice. We explain how a number of lower bound results are obtained in this model and how they can be applied for proving lower bounds for XML query processing.


[C15]
Randomized Computations on Large Data Sets: Tight Lower Bounds. Martin Grohe, André Hernich, and Nicole Schweikardt. PODS'06: 25th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 243-252, Chicago, USA, June 2006. ACM.
Download: pdf (208 kB).
Full version: CoRR Report, arXiv:cs.DB/0703081, March 2007, 29 pages.   Download: ps (524 kB), pdf (356 kB).

Abstract:
We study the randomized version of a computation model that restricts random access to external memory and internal memory space. Essentially, this model can be viewed as a powerful version of a data stream model that puts no cost on sequential scans of external memory (as other models for data streams) and, in addition, (like other external memory models, but unlike streaming models), admits several large external memory devices that can be read and written to in parallel.
We obtain tight lower bounds for the decision problems set equality, multiset equality, and checksort. More precisely, we show that any randomized one-sided-error bounded Monte Carlo algorithm for these problems must perform Omega(log N) random accesses to external memory devices, provided that the internal memory size is at most O(\sqrt[4]{N}/log N), where N denotes the size of the input data.
From the lower bound on the set equality problem we can infer lower bounds on the worst case data complexity of query evaluation for the languages XQuery, XPath, and relational algebra on streaming data. More precisely, we show that there exist queries in XQuery, XPath, and relational algebra, such that any (randomized) Las Vegas algorithm that evaluates these queries must perform Omega(log N) random accesses to external memory devices, provided that the internal memory size is at most O(\sqrt[4]{N}/log N).


[C16]
Approximation Schemes for First-Order Definable Optimisation Problems. Anuj Dawar, Martin Grohe, Stephan Kreutzer, and Nicole Schweikardt. LICS'06: 21st Annual IEEE Symposium on Logic in Computer Science, pages 411-420, Seattle, USA, August 2006. IEEE Computer Society.
Abstract: here.   Download: ps (259 kB), pdf (152 kB).   Full version: ps (314 kB), pdf (190 kB).

Abstract:
Let phi(X) be a first-order formula in the language of graphs that has a free set variable X, and assume that X only occurs positively in phi(X). Then a natural minimisation problem associated with phi(X) is to find, in a given graph G, a vertex set S of minimum size such that G satisfies phi(S). Similarly, if X only occurs negatively in phi(X), then phi(X) defines a maximisation problem. Many well-known optimisation problems are first-order definable in this sense, for example MINIMUM DOMINATING SET or MAXIMUM INDEPENDENT SET.
We prove that for each class C of graphs with excluded minors, in particular for each class of planar graphs, the restriction of a first-order definable optimisation problem to the class C has a polynomial time approximation scheme.
A crucial building block of the proof of this approximability result is a version of Gaifman's locality theorem for formulas positive in a set variable. This result may be of independent interest.


[C17]
Database Query Processing Using Finite Cursor Machines. Martin Grohe, Yuri Gurevich, Dirk Leinders, Nicole Schweikardt, Jerzy Tyszkiewicz, Jan Van den Bussche. ICDT'07: 11th International Conference on Database Theory, volume 4353 of Lecture Notes in Computer Science, pp. 284-298, Barcelona, Spain, January 2007. Springer-Verlag.
Download: pdf (447 kB).

Abstract:
We introduce a new abstract model of database query processing, finite cursor machines, that incorporates certain data streaming aspects. The model describes quite faithfully what happens in so-called "one-pass" and "two-pass query processing". Technically, the model is described in the framework of abstract state machines. Our main results are upper and lower bounds for processing relational algebra queries in this model, specifically, queries of the semijoin fragment of the relational algebra.


[C18]
Machine models and lower bounds for query processing. Nicole Schweikardt. PODS'07: 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 41-52, Beijing, China, June 2007. ACM.
Download: ps (420 kB), pdf (247 kB), slides (pdf) of my talk at PODS'07 (759 kB).

Abstract:
Two different scenarios for the processing of massive amounts of data have been extensively studied in recent years:

1. External memory processing: This scenario considers data that is stored in external memory. When processing such data, the resulting input/output communication between fast internal memory and slower external memory is a major performance bottleneck. There has been a wealth of research on the design and analysis of so-called external memory algorithms which aim at optimizing the costs produced by external memory accesses.

2. Data stream processing: This scenario considers data that is not stored but, instead, arrives as a stream and has to be processed on-the-fly by using only a limited amount of memory. Typical application areas for which data stream processing is relevant are, e.g., IP network traffic analysis or processing meteorological data generated by sensor networks.

This paper gives an overview of some recent work concerning the theoretical foundations of these two scenarios. The main focus is on generalizations of the classical data stream model where, apart from an internal memory of limited size, also a number of (potentially huge) streams may be used as external memory devices.


[C19]
CWA-Solutions for Data Exchange Settings with Target Dependencies. André Hernich and Nicole Schweikardt. PODS'07: 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 113-122, Beijing, China, June 2007. ACM.
Download: ps (692 kB), pdf (234 kB).

Abstract:
Data exchange deals with the following problem: given an instance over a source schema, a specification of the relationship between the source and the target, and dependencies on the target, construct an instance over a target schema that satisfies the given relationships and dependencies. Recently - for data exchange settings without target dependencies - Libkin (PODS'06) introduced a new concept of solutions based on the closed world assumption (so called CWA-solutions), and showed that, in some respects, this new notion behaves better than the standard notion of solutions considered in previous papers on data exchange.
The present paper extends Libkin's notion of CWA-solutions to data exchange settings with target dependencies. We show that, when restricting attention to data exchange settings with weakly acyclic target dependencies, this new notion behaves similarly as before: the core is the unique "minimal" CWA-solution, and computing CWA-solutions as well as certain answers to positive queries is possible in polynomial time and can be PTIME-hard. However, there may be more than one "maximal" CWA-solution. And going beyond the class of positive queries, we obtain that there are conjunctive queries with (just) one inequality, for which evaluating the certain answers is coNP-hard. Finally, we consider the EXISTENCE-OF-CWA-SOLUTIONS problem: while the problem is tractable for data exchange settings with weakly acyclic target dependencies, it turns out to be undecidable for general data exchange settings. As a consequence, we obtain that also the EXISTENCE-OF-UNIVERSAL-SOLUTIONS problem is undecidable in general.


[C20]
Boundedness of Monadic FO over Acyclic Structures. Stephan Kreutzer, Martin Otto, and Nicole Schweikardt. ICALP'07: 34th International Colloquium on Automata, Languages and Programming, volume 4596 of Lecture Notes in Computer Science, pages 1076-1088. Wroclaw, Poland, July 2007. Springer-Verlag.
Download: ps (214 kB), pdf (129 kB).

Abstract:
We study the boundedness problem for monadic least fixed points as a decision problem. While this problem is known to be undecidable in general and even for syntactically very restricted classes of underlying first-order formulae, we here obtain a decidability result for the boundedness issue for monadic fixed points over arbitrary first-order formulae in restriction to acyclic structures.


[C21]
Model theory makes formulas large. Anuj Dawar, Martin Grohe, Stephan Kreutzer, and Nicole Schweikardt. ICALP'07: 34th International Colloquium on Automata, Languages and Programming, volume 4596 of Lecture Notes in Computer Science, pages 1076-1088. Wroclaw, Poland, July 2007. Springer-Verlag.
Download: ps (199 kB), pdf (126 kB).

Abstract:
Gaifman's locality theorem states that every first-order sentence is equivalent to a local sentence. We show that there is no elementary bound on the length of the local sentence in terms of the original. Gaifman's theorem is an essential ingredient in several algorithmic meta theorems for first order logic. Our result has direct implications for the running time of the algorithms.
The classical Los-Tarski theorem states that every first-order sentence preserved under extensions is equivalent to an existential sentence. We show that there is no elementary bound on the length of the existential sentence in terms of the original. Recently, variants of the Los-Tarski theorem have been proved for certain classes of finite structures, among them the class of finite trees and more generally classes of structures of bounded tree width. Our lower bound also applies to these variants.
The first-order theory of trees is decidable. We prove that there is no elementary decision algorithm.
All these results are based on a similar technique of encoding large numbers by trees of small height in such a way that small formulas can speak about these numbers.
Notably, our lower bounds do not apply to restrictions of the results to structures of bounded degree. For such structures, we obtain elementary upper bounds in all cases. However, even there we can prove at least doubly exponential lower bounds.


[C22]
Lower Bounds for Multi-Pass Processing of Multiple Data Streams. Nicole Schweikardt. STACS'09: 26th International Symposium on Theoretical Aspects of Computer Science, pages 51-61. Freiburg, Germany, February 2009.
Download: pdf (109 kB).

Abstract:
This paper gives a brief overview of computation models for data stream processing, and it introduces a new model for multi-pass processing of multiple streams, the so-called mp2s-automata. Two algorithms for solving the set disjointness problem with these automata are presented. The main technical contribution of this paper is the proof of a lower bound on the size of memory and the number of heads that are required for solving the set disjointness problem with mp2s-automata.


Miscellaneous   (in chronological order) :

[M1]
On quantifier elimination in weak direct powers. Malika More and Nicole Schweikardt. Technical report, 24 pages. In Patrick Cegielski, editor, JAF 20: Journées sur les Arithmétiques Faibles (Weak Arithmetic Days), Fontainebleau, France, June 2001.
Download: ps.gz (117 kB), ps (315 kB), pdf (293 kB).

Abstract:
In this paper the following question is investigated: How can quantifier elimination in a given structure A be lifted to quantifier elimination in the weak direct power of A? Given an elimination set for a structure A, our main result provides an explicit list of functions and relations which form an elimination set for the weak direct power of A. This gives a deep understanding of how the expressive power of first-order logic over a weak direct power is related to the expressive power of first-order logic over the factor structure.
Moreover, we do not only consider weak direct powers, but a generalized form of direct powers (considerably different from Feferman's and Vaught's generalized products) which, amongst others, also cover full direct powers and Cartesian products.


Theses   (in chronological order) :

[T1]
The monadic second-order quantifier alternation hierarchy over grids and pictures. Nicole Schweikardt. Diplomarbeit (Master's thesis), Institut für Informatik, Johannes Gutenberg-Universität Mainz, Germany, August 1997, 116 pages.
Download: ps.gz (420 kB), ps (1755 kB), pdf (1359 kB).

Abstract:
The subject of this Master's thesis is the expressive power of monadic second-order logic over grids and pictures. Firstly, some well-known properties concerning monadic second-order logic, automata theory, and, in particular, recognizability of two-dimensional pictures are summarized. Afterwards, the results and proofs of the conference contribution [C1] are presented in detail.


[T2]
On the Expressive Power of First-Order Logic with Built-In Predicates. Nicole Schweikardt. Ph.D. thesis, Fachbereich Mathematik und Informatik, Johannes Gutenberg-Universität Mainz, Germany, December 2001, 237 pages. Published at Logos Verlag Berlin, 2002, ISBN 3-8325-0017-0.
Download: ps.gz (1214 kB), ps (2821 kB), pdf (4567 kB).   Received the GI-Dissertationspreis 2002.

Abstract:
This Ph.D. thesis investigates the expressive power of first-order logic on structures with built-in predicates such as the linear ordering and the arithmetic predicates of addition and multiplication. The overall question is: What happens if we enlarge the expressive power of first-order logic by adding certain built-in predicates or by adding the ability to count cardinalities of sets? The main results of this thesis can be subdivided into the following three topics:

1. Arithmetic and counting quantifiers: We show that Presburger arithmetic is closed under unary counting quantifiers. As an application we obtain an easy proof of Ruhl's result that reachability (and similarly, connectivity) in finite graphs is not expressible in first-order logic with unary counting quantifiers and built-in addition. Furthermore, the above result on Presburger arithmetic helps to show the failure of a particular version of the Crane Beach Conjecture.

2. The Crane Beach Conjecture: Here we consider string-languages that have a neutral letter, i.e., a letter that may be inserted or deleted in any string without changing its membership or non-membership in the language. In its parameterized version, the Crane Beach Conjecture (the CBC, for short) is said to be true for a class F of logical formulas and a class BIP of built-in predicates if and only if F(<) can express all neutral letter languages that are expressible in F(<,BIP). In the present thesis several negative, as well as a number of positive results on this parameterized version of the CBC are obtained. For example, we show that the CBC is true for first-order logic with built-in addition and a particular set of built-in unary predicates. This also implies a lower bound with respect to the circuit complexity class AC0: Neutral letter languages that are not star-free regular do not belong to a particular uniform version of AC0.
In its original and unrestricted version the CBC states that the single built-in predicate of linear ordering suffices to express all neutral letter languages that are definable in first-order logic with arbitrary built-in predicates. This unrestricted version was falsified by Immerman. A result of the present thesis shows the extent to which this unrestricted version of the CBC is false: There does not even exist a countable set of built-in predicates that suffices to express all neutral letter languages that are definable in first-order logic with arbitrary built-in predicates. The proof of this result also provides a new intuition that gives a better understanding of what exactly causes the failure of the CBC.

3. Collapse results in database theory: Similarly as of the CBC, one speaks of a collapse result if a particular set of built-in predicates does not enable first-order logic to express more so-called <-generic database queries than the linear ordering alone. Most collapse results known so far restrict attention to finite databases. In the present thesis collapse results are obtained also for certain classes of infinite databases: Via an Ehrenfeucht-Fraïssé game argument we obtain the collapse for arbitrary databases embedded in context structures with the universe N of natural numbers and with certain built-in predicates. In some sense these results can be transferred from the discretely ordered universe N to densely ordered universes such as the set R of real numbers: There, the according collapse results are valid for all N-embeddable databases, i.e., for all databases whose active domain can be embedded into the natural numbers via a <-preserving mapping. We also show a lifting theorem that allows to lift collapse results for N-embeddable databases to the larger class of N-representable databases.


Succinct list of Nicole Schweikardt's publications: 
here
Back to Nicole Schweikardt's homepage: 
English / deutsch

Nicole Schweikardt