[dev] [RFC] Design of a vim like text editor
Marc André Tanner
2014-09-13 14:01:15 UTC
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.

Help welcome!

Why another text editor?
========================

It all started when I was recently reading the excellent Project Oberon[0],
where in chapter 5 a data structure for managing text is introduced.
I found this rather appealing and wanted to see how it works in practice.

After some time I decided that besides just having fun hacking around I
might as well build something which could (at least in the long run)
replace my current editor of choice: vim.

This should be accomplished by a reasonable amount of clean (your mileage
may vary), modern and legacy free C code. Certainly not an old, 500'000
lines[1] long, #ifdef cluttered mess which tries to run on all broken
systems ever envisioned by mankind.

Admittedly vim has a lot of functionality, most of which I don't use. I
therefore set out with the following main goals:

- Unicode aware

- binary clean

- handle arbitrary files (this includes large ones, think >100M SQL-dumps)

- unlimited undo/redo support

- syntax highlighting

- regex search (and replace)

- multiple file/window support

- extensible and configurable through familiar config.def.h mechanism

The goal could thus be summarized as "80% of vim's features (in other
words the useful ones) implemented in roughly 1% of the code".

Finally and most importantly it is fun! Writing a text editor presents
some interesting challenges and design decisions, some of which are
explained below.

Text management using a piece table/chain
=========================================

The core of this editor is a persistent data structure called a piece
table which supports all modifications in O(m), where m is the number
of non-consecutive editing operations. This bound could be further
improved to O(log m) by use of a balanced search tree, however the
additional complexity doesn't seem to be worth it, for now.

The actual data is stored in buffers which are strictly append only.
There are two types of buffers: a fixed-size one for the original file
content and append-only ones for all modifications.

A text, i.e. a sequence of bytes, is represented as a doubly linked
list of pieces, each with a pointer into a buffer and an associated
length. Pieces are never deleted but instead always kept around for
undo/redo support. A span is a range of pieces, consisting of a start
and end piece. Changes to the text are always performed by swapping
out an existing, possibly empty, span with a new one.
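
In simplified form the involved structures look roughly like this in C
(names and fields here are illustrative only, not necessarily the ones
used in text.c):

#include <stddef.h>

typedef struct {             /* storage for text data */
    char *data;              /* mmap(2)-ed original file or heap allocated modifications */
    size_t len;              /* bytes currently in use */
    size_t size;             /* total capacity */
} Buffer;

typedef struct Piece Piece;
struct Piece {               /* a chunk of text, never deleted once created */
    Piece *prev, *next;      /* doubly linked list of pieces forming the text */
    const char *data;        /* pointer into a Buffer */
    size_t len;              /* length in bytes */
};

typedef struct {             /* a span: the range of pieces from start to end */
    Piece *start, *end;
    size_t len;              /* length in bytes of the whole span */
} Span;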

An empty document is represented by two special sentinel pieces which
always exist:

/-+ --> +-\
| |     | |
\-+ <-- +-/
 #1      #2

Loading a file from disk is as simple as mmap(2)-ing it into a buffer,
creating a corresponding piece and adding it to the doubly linked list.
Hence loading a file is a constant time operation, i.e. independent of
the actual file size (assuming the operating system uses demand paging).

/-+ --> +-----------------+ --> +-\
| |     | I am an editor! |     | |
\-+ <-- +-----------------+ <-- +-/
 #1             #3               #2
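
A minimal sketch of that loading step, assuming a plain read-only
mmap(2) of the whole file (error handling mostly omitted, names
illustrative):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch only: map the file read only, the piece chain then simply points
 * into this memory. Real code additionally has to deal with empty files,
 * non-regular files and all the error cases. */
static const char *map_file(const char *name, size_t *len)
{
    struct stat sb;
    int fd = open(name, O_RDONLY);
    if (fd == -1)
        return NULL;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return NULL;
    }
    *len = sb.st_size;
    void *data = mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close(2) */
    return data == MAP_FAILED ? NULL : data;
}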

Insert
------

Inserting a chunk of data amounts to appending the new content to a
modification buffer, followed by the creation of new pieces. An insertion
in the middle of an existing piece requires the creation of 3 new pieces.
Two of them hold references to the text before and after the insertion
point, respectively, while the third one points to the newly added text.

/-+ --> +---------------+ --> +----------------+ --> +--+ --> +-\
| |     | I am an editor|     |which sucks less|     |! |     | |
\-+ <-- +---------------+ <-- +----------------+ <-- +--+ <-- +-/
 #1            #4                      #5              #6      #2

modification buffer content: "which sucks less"

During this insertion operation the old span [3,3] has been replaced
by the new span [4,6]. Notice that the pieces in the old span were not
changed, therefore still point to their predecessors/successors, and can
thus be swapped back in.

If the insertion point happens to be at a piece boundary, the old span
is empty, and the new span only consists of the newly allocated piece.
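
To make the splitting step concrete, here is a rough sketch of an
insertion into the middle of a piece; piece_new() and span_swap() are
assumed helpers, not the actual API:

#include <stddef.h>

typedef struct Piece Piece;
struct Piece { Piece *prev, *next; const char *data; size_t len; };

Piece *piece_new(const char *data, size_t len);     /* assumed helper */
void span_swap(Piece *old_start, Piece *old_end,    /* assumed helper, sketched below */
               Piece *new_start, Piece *new_end);

/* Sketch: insert text that was already appended to the modification buffer
 * (at `data`, `len` bytes) at offset `off` into the existing piece `p`. */
static void insert_middle(Piece *p, size_t off, const char *data, size_t len)
{
    Piece *before = piece_new(p->data, off);                /* text up to the insertion point */
    Piece *added  = piece_new(data, len);                   /* the newly added text */
    Piece *after  = piece_new(p->data + off, p->len - off); /* text after the insertion point */

    /* link the three new pieces together ... */
    before->next = added; added->prev = before;
    added->next  = after; after->prev = added;

    /* ... and swap the old span [p,p] for the new span [before,after];
     * p itself keeps its pointers so undo can swap it back in later */
    span_swap(p, p, before, after);
}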

Delete
------

Similarly a delete operation splits the pieces at appropriate places.

/-+ --> +-----+ --> +--+ --> +-\
| |     | I am|     |! |     | |
\-+ <-- +-----+ <-- +--+ <-- +-/
 #1       #7         #6       #2

Here the old span [4,5] was replaced by the new span [7,7]. The underlying
buffers remain unchanged.
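
Both insert and delete thus boil down to the same primitive: splicing a
new span into the list where the old one was, while the old pieces keep
their pointers. A sketch (illustrative names; the empty-old-span case of
an insertion at a piece boundary is omitted):

#include <stddef.h>

typedef struct Piece Piece;
struct Piece { Piece *prev, *next; const char *data; size_t len; };

/* Sketch: splice the new span [nbeg,nend] into the list in place of the
 * old span [obeg,oend]. The surrounding pieces always exist thanks to the
 * two sentinels. obeg/oend keep their prev/next pointers untouched, so the
 * operation can later be reverted by swapping the spans the other way. */
static void span_swap(Piece *obeg, Piece *oend, Piece *nbeg, Piece *nend)
{
    Piece *before = obeg->prev;
    Piece *after  = oend->next;

    before->next = nbeg; nbeg->prev = before;
    nend->next = after;  after->prev = nend;
}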

Cache
-----

Notice that the common case of appending text to a given piece is fast
since the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the least recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append-only
way). As a consequence these in-place changes can not be undone.
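
The fast path check could look roughly like this (illustrative only, the
actual cache logic may differ):

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct { char *data; size_t len, size; } Buffer;
typedef struct Piece Piece;
struct Piece { Piece *prev, *next; const char *data; size_t len; };

/* Sketch: if we are typing right at the end of the cached piece and that
 * piece also ends exactly at the current end of the modification buffer,
 * the new data can simply be appended and the piece grown in place, without
 * allocating any new pieces. Otherwise fall back to the regular machinery. */
static bool cache_append(Piece *cached, Buffer *mod, size_t pos_in_piece,
                         const char *data, size_t len)
{
    if (!cached || pos_in_piece != cached->len ||
        cached->data + cached->len != mod->data + mod->len ||
        mod->len + len > mod->size)
        return false;
    memcpy(mod->data + mod->len, data, len);
    mod->len += len;
    cached->len += len;
    return true;
}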

Undo/redo
---------

Since the buffers are append only and the spans/pieces are never destroyed,
undo/redo functionality is implemented by swapping the required spans/pieces
back in.

As illustrated above, each change to the text is recorded by an old and
a new span. An action consists of multiple changes which logically belong
to each other and should thus also be reverted together. For example
a search and replace operation is one action with possibly many changes
all over the text. Actions are kept on an undo and a redo stack, respectively.

A state of the text can be marked by means of a snapshotting operation.
The undo/redo functionality operates on such marked states and switches
back and forth between them.

The history is currently linear, no undo / history tree is implemented.
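
The recorded information and the undo step itself can be sketched as
follows (again illustrative names only):

#include <stddef.h>

typedef struct Piece Piece;
struct Piece { Piece *prev, *next; const char *data; size_t len; };
typedef struct { Piece *start, *end; size_t len; } Span;

typedef struct Change Change;
struct Change {           /* one swap of an old span for a new one */
    Span old, new;
    Change *next;         /* all changes of an action form a list */
};

typedef struct Action Action;
struct Action {           /* changes which logically belong together */
    Change *changes;
    Action *next;         /* actions form the undo (or redo) stack */
};

void span_restore(Span *current, Span *previous);   /* assumed helper: relink a span */

/* Sketch: undoing an action swaps every recorded new span back out for its
 * old one; redo walks the changes in the opposite direction. */
static void action_undo(Action *a)
{
    for (Change *c = a->changes; c; c = c->next)
        span_restore(&c->new, &c->old);
}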

Properties
----------

The main advantage of the piece chain as described above is that all
operations are performed independently of the file size and are instead
linear in the number of pieces, i.e. editing operations. The original file
buffer never changes, which means the mmap(2) can be performed read only,
making optimal use of the operating system's virtual memory / paging system.

The maximum editable file size is limited by the amount of memory a process
is allowed to map into its virtual address space; this shouldn't be a problem
in practice. The whole process assumes that the file can be used as is.
In particular the editor assumes all input and the file itself is encoded
as UTF-8. Supporting other encodings would require conversion using iconv(3)
or similar upon loading and saving the document, which defeats the whole
purpose.

Similarly the editor has to cope with the fact that lines can be terminated
either by \n or \n\r. There is no conversion to a line based structure in
place. Instead the whole text is exposed as a sequence of bytes. All
addressing happens by means of zero based byte offsets from the start of
the file.

The main disadvantage of the piece chain data structure is that the text
is not stored contiguously in memory, which makes seeking around somewhat
harder. This also implies that standard library calls like the regex(3)
functions can not be used as is. However, this is the case for all but
the simplest data structures used in text editors.

Screen Drawing
==============

The current code takes a rather simple approach to screen drawing. It
basically only remembers the starting position of the area being shown,
then fetches a "screen full" of bytes and outputs one character at a
time until the end of the window is reached. A consequence of this
approach is that lines are always wrapped and horizontal scrolling is
not supported.

No efforts are made to reduce the terminal output. This task is delegated
to the underlying curses library which already performs a kind of double
buffering. The window is always redrawn completely even if only a single
character changes. It turns out this is actually necessary if one wants
to support multiline syntax highlighting.

While outputting the individual characters a cell matrix is populated
where each entry stores the length in bytes of the character displayed
at this particular cell. For characters spanning multiple columns the
length is always stored in the leftmost cell. As an example a tab has a
length of 1 byte followed by up to 7 cells with a length of zero.
Similarly a \n\r line ending occupies only one screen cell but has a
length of 2.

This matrix is actually stored per line inside a doubly linked list of
structures representing screen lines. For each line we keep track of
the length in bytes of the underlying text, the display width of all
characters that are part of the line, and the logical line number.

All cursor positioning is performed in bytes from the start of
the file and works by traversing the doubly linked list of screen lines
until the correct line is found. Then the cell array is consulted to
move to the correct column.
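
To give an idea, the per-line bookkeeping and the byte-to-column step
might be sketched like this (field names are illustrative):

#include <stddef.h>

typedef struct {
    size_t len;     /* bytes of the character whose leftmost cell this is,
                       0 for the continuation cells of tabs / wide characters */
} Cell;

typedef struct Line Line;
struct Line {               /* one screen line, part of a doubly linked list */
    Line *prev, *next;
    size_t len;             /* length in bytes of the underlying text */
    size_t width;           /* display width of all characters in the line */
    size_t lineno;          /* logical line number */
    Cell cells[];           /* one entry per screen column */
};

/* Sketch: map a byte offset relative to the start of this screen line to
 * the column displaying that character, by summing up the cell lengths. */
static int line_byte_to_col(const Line *l, size_t pos, int columns)
{
    size_t byte = 0;
    for (int col = 0; col < columns; col++) {
        size_t clen = l->cells[col].len;
        if (clen > 0 && pos < byte + clen)
            return col;
        byte += clen;
    }
    return columns > 0 ? columns - 1 : 0;
}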

Syntax-Highlighting
-------------------

The editor takes a similar regex-based approach to syntax highlighting
as sandy and reuses its syntax definitions, but always applies them to
a "screen full" of text, thus enabling multiline coloring.

Currently only the highlighting rules for C have been imported from sandy.
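
For reference, such a rule essentially boils down to a compiled regular
expression plus a display attribute which is applied repeatedly over the
visible region; a rough sketch (not the exact config.def.h format):

#include <regex.h>
#include <stddef.h>

typedef struct {
    const char *rule;   /* extended regular expression, e.g. "(if|else|while|return)" */
    regex_t regex;      /* compiled once at startup */
    int attr;           /* curses attribute / color pair to apply */
} SyntaxRule;

/* Sketch: run one rule over the NUL terminated "screen full" of text in
 * `buf` and report every match. Because the region covers the whole
 * window, multiline constructs such as comments get colored correctly. */
static void highlight(SyntaxRule *r, const char *buf,
                      void (*apply)(size_t start, size_t end, int attr))
{
    regmatch_t m;
    size_t off = 0;
    while (buf[off] != '\0' &&
           regexec(&r->regex, buf + off, 1, &m, off ? REG_NOTBOL : 0) == 0) {
        apply(off + m.rm_so, off + m.rm_eo, r->attr);
        off += m.rm_eo > 0 ? (size_t)m.rm_eo : 1;   /* avoid looping on empty matches */
    }
}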

Window-Management
-----------------

It is possible to open multiple windows via the :split/:vsplit/:open
commands or by passing multiple files on the command line.

In principle it would be nice to follow a similar client/server approach
as sam/samterm i.e. having the main editor as a server and each window
as a separate client process with communication over a unix domain socket.

That way window management would be taken care of by dwm or dvtm and the
different client processes would still share common cut/paste registers
etc.

However at the moment I don't want to open that can of worms and instead
settled for a single process architecture.

Search and replace
------------------

This is one of the last big conceptual problems.

Currently the editor copies the whole text to a contiguous memory block
and then uses the standard regex functions from libc. Clearly this is not
a satisfactory solution for large files and kind of defeats the whole
effort spent on the piece table.

The long term solution is to write our own regular expression engine or
modify an existing one to make use of the iterator API. This would allow
efficient search without having to double the memory consumption. At some
point I will have to (re)read the papers by Russ Cox[2] and Rob Pike
on this topic.
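
To make the idea a bit more concrete, a purely hypothetical sketch of the
shape such an interface could take (this is not the existing iterator API
in text.[ch]):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch -- not the actual iterator API. A regex engine written
 * against such an interface could scan the text piece by piece instead of
 * requiring one contiguous copy of the whole file. */
typedef struct {
    const char *start;  /* current position within the current piece */
    const char *end;    /* end of the current piece's data */
    /* ... plus whatever state is needed to find the neighbouring pieces ... */
} Iterator;

bool iterator_next_chunk(Iterator *it);    /* advance to the following piece */

/* Example consumer: count newlines without materializing the whole text. */
static size_t count_newlines(Iterator *it)
{
    size_t n = 0;
    do {
        for (const char *p = it->start; p < it->end; p++)
            if (*p == '\n')
                n++;
    } while (iterator_next_chunk(it));
    return n;
}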

Command-Prompt
--------------

The editor needs some form of command prompt to get user input
(think :, /, ? in vim).

At first I wanted to implement this in terms of an external process,
similar to the way it is done in sandy with communication back to the
editor via a named pipe.

At some point I thought it would be possible to provide all editor commands
as shell scripts in a given directory, then set $PATH accordingly and run
the shell. This would give us readline editing, tab completion, history and
Unicode support for free. But unfortunately it won't work due to quoting
issues and other conflicts of special symbols with different meanings.

Later it occurred to me that the editor prompt could just be treated as a
special one-line file. That is, all the main editor functionality is reused
with a slightly different set of key bindings.

This approach also has the added benefit of further testing the main editor
component (in particular corner cases like editing at the end of the file).

Editor Frontends
================

The editor core is written in a library like fashion which should make
it possible to write multiple frontends with possibly different user
interfaces/paradigms.

At the moment there exists a barely functional, non-modal nano/sandy
like interface which was used during early testing. The default interface
is a vim clone called vis.

The frontend to run is selected based on the executable name.

Key binding modes
-----------------

The available key bindings for the different modes are arranged in a
hierarchical way in config.h (there is also an ASCII tree giving an
overview in that file). Each mode can specify a parent mode which is
used to look up a key binding if it is not found in the current mode.
This reduces redundancy for keys which have the same meaning in
different modes.

Each mode can also provide hooks which are executed upon entering/leaving
the mode and when there was an unmatched key.
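
As an illustration, the lookup with parent fallback amounts to something
like this (made-up names, the real tables live in config.h):

#include <stddef.h>
#include <string.h>

typedef struct {
    const char *key;            /* e.g. "i", "dd", "gg" */
    void (*func)(void);         /* action bound to that key */
} KeyBinding;

typedef struct Mode Mode;
struct Mode {
    Mode *parent;               /* consulted if a key is not found here */
    KeyBinding *bindings;       /* terminated by an entry with key == NULL */
    void (*enter)(void);        /* optional hooks */
    void (*leave)(void);
    void (*unmatched)(const char *key);
};

/* Sketch: walk up the mode hierarchy until some binding matches. */
static KeyBinding *mode_lookup(Mode *mode, const char *key)
{
    for (; mode; mode = mode->parent)
        for (KeyBinding *b = mode->bindings; b && b->key; b++)
            if (strcmp(b->key, key) == 0)
                return b;
    return NULL;
}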

vis a vim like frontend
-----------------------

The vis frontend uses a similar approach to the one suggested by Markus
Teich[3] but it turns out to be a bit more complicated. For starters
there are movements and commands which consist of more than one key/
character. As a consequence the key lookup is not a simple array
dereference but instead the arrays are looped over until a match
is found.
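
Roughly, the lookup has to distinguish a full match from a mere prefix of
a longer binding (so that e.g. 'g' waits to see whether 'gg' or 'ge'
follows); a sketch with made-up names:

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *keys;           /* key sequence, e.g. "gg", "ge", "dd" */
    void (*func)(void);
} KeyBinding;

enum Match { MATCH_NONE, MATCH_PREFIX, MATCH_FULL };

/* Sketch: loop over the bindings of the current mode. If the input typed so
 * far only prefixes some longer binding, more keys have to be awaited before
 * anything is executed. */
static enum Match keys_match(const KeyBinding *bindings, size_t n,
                             const char *input, const KeyBinding **match)
{
    bool prefix = false;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(bindings[i].keys, input) == 0) {
            *match = &bindings[i];
            return MATCH_FULL;
        }
        if (strncmp(bindings[i].keys, input, strlen(input)) == 0)
            prefix = true;
    }
    return prefix ? MATCH_PREFIX : MATCH_NONE;
}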

The following section gives a quick overview over various vim features
and their current support in vis.

Operators
---------
working: d (delete), c (change), y (yank), p (put)
planned: > (shift-right), < (shift-left)
those depend on handling of indentation (tabs <-> spaces)

Movements
---------
h (char left)
l (char right)
j (line down)
k (line up)
0 (start of line)
^ (first non-blank of line)
g_ (last non-blank of line)
$ (end of line)
% (match bracket)
b (previous start of a word)
w (next start of a word)
e (next end of a word)
ge (previous end of a word)
{ (previous paragraph)
} (next paragraph)
( (previous sentence)
) (next sentence)
gg (begin of file)
G (goto line or end of file)
| (goto column)
n (repeat last search forward)
N (repeat last search backwards)
f{char} (to next occurrence of char to the right)
t{char} (till before next occurrence of char to the right)
F{char} (to next occurrence of char to the left)
T{char} (till before next occurrence of char to the left)
/{text} (to next match of text in forward direction)
?{text} (to next match of text in backward direction)

There is currently no distinction between what vim calls a WORD and
a word; only the former is implemented, though infrastructure for
the latter also exists.

The semantics of a paragraph and a sentence are also not always 100%
the same as in vim.

Some of these commands do not work as in vim when prefixed with a
digit, i.e. a multiplier. As an example, 3$ should move to the end
of the 3rd line down. The way it currently behaves is that the first
movement places the cursor at the end of the current line and the last
two thus have no effect.

In general there are still a lot of improvements to be made in the
cases where movements are forced to be line- or character-wise. Also,
some of them should be inclusive in some contexts and exclusive in others.
At the moment they always behave the same.

Text objects
------------

All of the following text objects are implemented in an inner variant
(prefixed with 'i') and a normal variant (prefixed with 'a'):

w word
s sentence
p paragraph
[,], (,), {,}, <,>, ", ', ` block enclosed by these symbols

For word, sentence and paragraph there is no difference between the
inner and normal variants.

Modes
-----

At the moment there exists a more or less functional insert, replace
and character wise visual mode.

A line wise visual mode is planned.

Marks
-----

Only the 26 lower case marks [a-z] are supported. No marks across files
are supported. Marks are not preserved over editing sessions.

Registers
---------

Only the 26 lower case registers [a-z] and 1 additional default register
are supported.

Undo/Redo and Repeat
--------------------

The text is currently snapshotted whenever an operator is completed as
well as when insert or replace mode is left. Additionally, a snapshot
is also taken if a certain idle time elapses in insert or replace mode.

Another idea is to snapshot based on the distance between two consecutive
editing operations (as they are likely unrelated and thus should be
individually reversible).

The repeat command '.' currently only works for operators. This for
example means that inserting text can not be repeated (i.e. inserted
again). The same restriction also applies to commands which are not
implemented in terms of operators, such as 'o', 'O', 'J' etc.

Command line prompt
-------------------

At the ':'-command prompt only the following commands are recognized:

:nnn      go to line nnn
:edit     replace current file with a new one or reload it from disk
:open     open a new window
:qall     close all windows, exit editor
:quit     close currently focused window
:read     insert content of another file at current cursor position
:split    split window horizontally
:vsplit   split window vertically
:wq       write changes then close window
:write    write current buffer content to file

The substitute command is recognized but not yet implemented. The '!'
command to filter text through an external program is also planned.
At some point the range syntax should probably also be supported.

History support, tab completion and wildcard expansion are other
worthwhile features.

Tab <-> Space
-------------

Currently there is no expand-tab functionality, i.e. tabs are always
inserted as is. For me personally this is no problem at all. Tabs
should be used for indentation! That way everybody can configure their
preferred tab width, whereas spaces should only be used for alignment.

Jump list and change list
-------------------------

Neither the jump list nor the change list is currently supported.

Mouse support
-------------

The mouse is currently not used at all.

Other features
--------------

Other things I would like to add in the long term are:

+ code completion: this should be done as an external process. I will
have to take a look at the tools from the llvm / clang project. Maybe
dvtm's terminal emulation support could be reused to display an
slmenu inside the editor at the cursor position?

+ something similar to vim's quick fix functionality

Stuff which vim does which I don't use and have no plans to add:

- GUIs (neither x11, motif, gtk, win32 ...)
- text folding
- visual block mode
- plugins (certainly not vimscript, if anything it should be lua based)
- runtime key bindings
- right-to-left text
- tabs (as in multiple workspaces)
- ex mode
- macro recording

How to help?
------------

At this point it might be best to fetch the code, edit some scratch file,
notice an odd behavior or missing functionality, write and submit a patch
for it, then iterate.

WARNING: There are probably still some bugs left which could corrupt your
unsaved changes. Use at your own risk. At this point I suggest only
editing non-critical files which are under version control and
thus easily recoverable!

git clone git://repo.or.cz/vis.git

A quick overview over the code structure to get you started:

config.def.h       definition of key bindings, commands, syntax highlighting etc.
vis.c              vi(m) specific editor frontend, program entry point
editor.[ch]        screen / window / statusbar / command prompt management
window.[ch]        window drawing / syntax highlighting / cursor placement
text-motions.[ch]  movement functions take a file position and return a new one
text-objects.[ch]  functions take a file position and return a file range
text.[ch]          low level text / marks / {un,re}do / piece table implementation

Hope this gets the interested people started. Feel free to ask questions
if something is unclear! There are still a lot of bugs left to fix, but
by now I'm fairly sure that the general concept should work.

As always, comments and patches welcome!

Cheers,
Marc

[0] http://www.inf.ethz.ch/personal/wirth/ProjectOberon/
[1] https://www.openhub.net/p/vim
[2] http://swtch.com/~rsc/regexp/
[3] http://lists.suckless.org/dev/1408/23219.html
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Christian Neukirchen
2014-09-13 14:39:15 UTC
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Funny, I just thought tonight about a variant of vi where the newline is
a real char... you implemented it exactly like this! (E.g. $x is like
J, otoh $rX doesn't work like that...)

o seems to be broken on the last line.

Looks very promising already! Perhaps the editing mode could be shown
in the mode line.

Thanks,
--
Christian Neukirchen <***@gmail.com> http://chneukirchen.org
Marc André Tanner
2014-09-13 15:16:16 UTC
Post by Christian Neukirchen
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Funny, I just thought tonight about a variant of vi where the newline is
a real char... you implemented it exactly like this! (E.g. $x is like
J, otoh $rX doesn't work like that...)
Hopefully fixed now. It was that way because with $RX (i.e. in replace mode)
new lines should not be overwritten.
Post by Christian Neukirchen
o seems to be broken on the last line.
A lot of stuff is broken on the last line and therefore also on the
command prompt. The problem is that the iterator API currently only
works on [0, size-1] whereas movements to the end of the last line
end up at size.
Post by Christian Neukirchen
Looks very promising already!
Thanks!
Post by Christian Neukirchen
Perhaps the editing mode could be shown in the mode line.
I agree.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Marc André Tanner
2014-09-14 09:23:56 UTC
Post by Christian Neukirchen
o seems to be broken on the last line.
This and other issues related to movements / modifications at end of
the file are likely fixed now. For example Ctrl+w at the end of the
command prompt now also works as expected.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
q***@c9x.me
2014-09-14 15:20:11 UTC
Post by Marc André Tanner
Post by Christian Neukirchen
o seems to be broken on the last line.
This and other issues related to movements / modifications at end of
the file are likely fixed now. For example Ctrl+w at the end of the
command prompt now also works as expected.
What the heck is so special with the end of files?


-- Q.
FRIGN
2014-09-13 14:58:35 UTC
On Sat, 13 Sep 2014 16:01:15 +0200
Marc André Tanner <***@brain-dump.org> wrote:

Hey Marc,
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
your mail made my day! I've read your concept and am delighted by how well
thought-out it is. The piece table is a good approach, which has also been
verified by Charles Crowley in "Data Structures for Text Sequences".
Post by Marc André Tanner
This should be accomplished by a reasonable amount of clean (your mileage
may vary), modern and legacy free C code. Certainly not an old, 500'000
lines[1] long, #ifdef cluttered mess which tries to run on all broken
systems ever envisioned by mankind.
I know _exactly_ what you mean and you are perfectly right. I couldn't wait
for an alternative to vim to show up and intentionally didn't "study" vim
too thoroughly.

I have got one question though: When you are talking about Unicode-awareness,
are you talking about UTF-8 or more complex sets?
Post by Marc André Tanner
It is possible to open multiple windows via the :split/:vsplit/:open
commands or by passing multiple files on the command line.
In principle it would be nice to follow a similar client/server approach
as sam/samterm i.e. having the main editor as a server and each window
as a separate client process with communication over a unix domain socket.
That way window management would be taken care of by dwm or dvtm and the
different client processes would still share common cut/paste registers
etc.
However at the moment I don't want to open that can of worms and instead
settled for a single process architecture.
Going with named pipes or sockets would be the better approach.
Post by Marc André Tanner
- GUIs (neither x11, motif, gtk, win32 ...)
- text folding
- visual block mode
- plugins (certainly not vimscript, if anything it should be lua based)
- runtime key bindings
- right-to-left text
- tabs (as in multiple workspaces)
- ex mode
- macro recording
I agree with all of them. Many "features" in vim evolved simply from the fact
that programming languages like C++ and Java require whole IDEs to be
written (especially the class-like-structure is a curse).
For the other stuff, there should be ways to do it outside the editor.
Post by Marc André Tanner
At this point it might be best to fetch the code, edit some scratch file,
notice an odd behavior or missing functionality, write and submit a patch
for it, then iterate.
Playing around with it I noticed that "dd" doesn't work in the last line
and sometimes mixes up things.
Write a document with "ee"'s in each line. Then do a dd. The last line
won't get deleted and sometimes, a line is removed, but leaves a single "e"
as a trace.
I may send you a patch in the next few days using the great arg.h by
Christoph whenever possible. I also noticed some smaller warnings while
compiling, which should be trivial to fix.

All in all, great work on this piece of software! I see you spent a lot
of time designing and writing the well-commented code.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Marc André Tanner
2014-09-14 09:32:10 UTC
Post by FRIGN
I have got one question though: When you are talking about Unicode-awareness,
are you talking about UTF-8 or more complex sets?
Just UTF-8. The internals assume it in some cases to move from one character
to the next.
Post by FRIGN
Post by Marc André Tanner
It is possible to open multiple windows via the :split/:vsplit/:open
commands or by passing multiple files on the command line.
In principle it would be nice to follow a similar client/server approach
as sam/samterm i.e. having the main editor as a server and each window
as a separate client process with communication over a unix domain socket.
That way window management would be taken care of by dwm or dvtm and the
different client processes would still share common cut/paste registers
etc.
However at the moment I don't want to open that can of worms and instead
settled for a single process architecture.
Going with named pipes or sockets would be the better approach.
It can always be reconsidered later on. For now I want to keep it simple and
fix the other issues. My experience with abduco, which implements such a
client / server architecture, showed that it is not completely trivial.
Post by FRIGN
Playing around with it I noticed that "dd" doesn't work in the last line
and sometimes mixes up things.
Write a document with "ee"'s in each line. Then do a dd. The last line
won't get deleted and sometimes, a line is removed, but leaves a single "e"
as a trace.
This should now be fixed; however, due to the way new lines are handled
the cursor is not moved up a line.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Andrew Hills
2014-09-13 15:06:33 UTC
Hi Marc,

Thank you for the thorough and illustrated RFC. If you have not already
done so, I suggest you keep this text around with the project.
Post by Marc André Tanner
Notice that the common case of appending text to a given piece is fast
since, the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the least recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append
only way). As a consequence they can not be undone.
This seems like behavior that will surprise me, and possibly even
others. (Possibly using it will demonstrate otherwise.) Is there any
convenient workaround?
Post by Marc André Tanner
The history is currently linear, no undo / history tree is implemented.
Are orphaned pieces (on dead branches) eliminated? Is there any useful
interface for navigating a history tree that would make the feature
worth having?
Post by Marc André Tanner
The editor takes a similar regex-based approach to syntax highlighting
than sandy and reuses its syntax definitions but always applies them to
a "screen full" of text thus enabling multiline coloring.
How does this work when important parts of the syntax are off of the screen?
Post by Marc André Tanner
The repeat command '.' currently only works for operators. This for
example means that inserting text can not be repeated (i.e. inserted
again). The same restriction also applies to commands which are not
implemented in terms of operators, such as 'o', 'O', 'J' etc.
Is this intended, or is the vim-like behavior planned?
Post by Marc André Tanner
+ code completion: this should be done as an external process. I will
have to take a look at the tools from the llvm / clang project. Maybe
dvtm's terminal emulation support could be reused to display an
slmenu inside the editor at the cursor position?
This feature seems unnecessary; do others use this? The last time I had
code completion was in Eclipse, when I was younger and more foolish, and
I don't miss it at all.
Post by Marc André Tanner
- macro recording
Macros are one of my most-used Vim features; they are very useful for
repetitive editing of complex files where regular expressions are more
of a pain. Even if you have no desire for them, would you accept patches
to add the feature, or should this list be considered a blacklist?

Thanks for your work thus far,
Andrew
Markus Teich
2014-09-13 17:09:23 UTC
Post by Andrew Hills
Post by Marc André Tanner
+ code completion: this should be done as an external process. I will
have to take a look at the tools from the llvm / clang project. Maybe
dvtm's terminal emulation support could be reused to display an
slmenu inside the editor at the cursor position?
This feature seems unnecessary; do others use this? The last time I had
code completion was in Eclipse, when I was younger and more foolish, and
I don't miss it at all.
Heyho,

I find the vim style completion pretty handy. It just completes words which it
already knows (from open/included files), so you don't need to implement complex
language- and context-dependent filtering. This can also be used when writing
long texts with multiple occurrences of some weirdly hard to type scientific
terms.

--Markus
FRIGN
2014-09-13 17:17:59 UTC
On Sat, 13 Sep 2014 10:09:23 -0700
Post by Markus Teich
I find the vim style completion pretty handy. It just completes words, which it
already knows (from open/included files), so you don't need to implement complex
language- and context-dependent filtering. This can also be used when writing
long texts with multiple occurences of some weirdly hard to type scientific
terms.
I personally disagree, but I'm sure this is a matter of taste.
We can all agree on the fact this feature is not a priority.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Marc André Tanner
2014-09-13 17:38:21 UTC
Post by Andrew Hills
Hi Marc,
Thank you for the thorough and illustrated RFC. If you have not already
done so, I suggest you keep this text around with the project.
Good idea, I've added it as README for now.
Post by Andrew Hills
Post by Marc André Tanner
Notice that the common case of appending text to a given piece is fast
since, the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the least recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append
only way). As a consequence they can not be undone.
This seems like behavior that will surprise me, and possibly even
others. (Possibly using it will demonstrate otherwise.) Is there any
convenient workaround?
This is just an implementation detail. In practice you will always
be able to get back to the state you had either:

- at the time after your last operator command

- at an idle period of 3 seconds in either insert or replace mode

As I already mentioned, it would also be possible to add a heuristic
based on the distance between consecutive editing operations. In terms
of API whenever you call text_snapshot(...) you will be able to return
to this state.
Post by Andrew Hills
Post by Marc André Tanner
The history is currently linear, no undo / history tree is implemented.
Are orphaned pieces (on dead branches) eliminated?
When you undo a few changes and then start adding new ones, the previously
undone pieces are thrown away. That is, no branch and hence no tree is
created.
Post by Andrew Hills
Is there any useful
interface for navigating a history tree that would make the feature
worth having?
vim supports the :earlier, :later commands among other things (see also
:help undo-redo). I believe there is also a plugin to display the history
as some form of a graph/tree. I personally have no use for this stuff.
Post by Andrew Hills
Post by Marc André Tanner
The editor takes a similar regex-based approach to syntax highlighting
than sandy and reuses its syntax definitions but always applies them to
a "screen full" of text thus enabling multiline coloring.
How does this work when important parts of the syntax are off of the screen?
At the moment not at all. But one could start reading and coloring a portion
of the text before what is actually displayed in the window. However, this
is not currently done.

At some point I will need to take another look at the whole syntax stuff;
it could probably be made a bit more efficient. For example the rules for
multiline comments should be tested first, such that all others are
skipped once a comment is found.
Post by Andrew Hills
Post by Marc André Tanner
The repeat command '.' currently only works for operators. This for
example means that inserting text can not be repeated (i.e. inserted
again). The same restriction also applies to commands which are not
implemented in terms of operators, such as 'o', 'O', 'J' etc.
Is this intended, or is the vim-like behavior planned?
It is just a result of the current implementation. It would be possible to
promote these commands to self contained operators which would make them
repeatable. I don't know yet what the best way is.
Post by Andrew Hills
Post by Marc André Tanner
+ code completion: this should be done as an external process. I will
have to take a look at the tools from the llvm / clang project. Maybe
dvtm's terminal emulation support could be reused to display an
slmenu inside the editor at the cursor position?
This feature seems unnecessary; do others use this? The last time I had
code completion was in Eclipse, when I was younger and more foolish, and
I don't miss it at all.
Post by Marc André Tanner
- macro recording
Macros are one of my most-used Vim features; they are very useful for
repetitive editing of complex files where regular expressions are more
of a pain. Even if you have no desire for them, would you accept patches
to add the feature, or should this list be considered a blacklist?
Yes, I will certainly evaluate patches which add other kinds of functionality.
It is more a list of stuff which I personally don't (currently?) use/need
and therefore have no particular interest in.

On a technical level for this to work, you would also have to repeat
past commands but to be honest I don't even know what is possible with
macros.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Silvan Jegen
2014-09-13 15:25:32 UTC
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
I tested the editor briefly and it looks really promising, thanks for
the hard work! Vis looks like something that I hoped neovim would evolve
into eventually...
Post by Marc André Tanner
There is currently no distinction between what vim calls a WORD and
a word, only the former is implemented. Though infrastructure for
the latter also exists.
The semantics of a paragraph and a sentence is also not always 100%
the same as in vim.
Some of these commands do not work as in vim when prefixed with a
digit i.e. a multiplier. As an example 3$ should move to the end
of the 3rd line down. The way it currently behaves is that the first
movement places the cursor at the end of the current line and the last
two have thus no effect.
In general there are still a lot of improvements to be made in the
case movements are forced to be line or character wise. Also some of
them should be inclusive in some context and exclusive in others.
At the moment they always behave the same.
In my brief testing I found that H, M, L (in Vim: moving the cursor to
the first, middle or last line of a window) are not implemented and that
G does not move the cursor to the end of the file yet.

In general, are patches that modify vis' cursor movement or operator
behaviour to be closer to that of Vim welcome?


Cheers,

Silvan
Marc André Tanner
2014-09-13 16:48:15 UTC
Post by Silvan Jegen
In my brief testing I found that H, M, L (in Vim: moving the cursor to
the first, middle or last line of a window) are not implemented
They were easy enough to add, should work now.
Post by Silvan Jegen
and that G does not move the cursor to the end of the file yet.
This should now also work. By default the multiplier is 1, which means
G and 1G are the same, which they should not be in this case.
Post by Silvan Jegen
In general, are patches that modify vis' cursor movement or operator
behaviour to be closer to that of Vim welcome?
Yes of course! They will be evaluated on an individual basis.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Andrew Hills
2014-09-13 16:15:55 UTC
FYI, anyone building on OpenBSD (and possibly other BSDs), add
-D_BSD_SOURCE to CFLAGS in config.mk for SIGWINCH.
Markus Teich
2014-09-13 16:58:29 UTC
For word, sentence and paragraph there is no difference between the inner
and normal variants.
Heyho Marc,

there should be a difference. For example you normally want to use the inner
variant with the change operator and the normal (outer) variant with the delete
operator. To make the meaning of inner/outer dependent on the operator would be
inconsistent.

--Markus
Marc André Tanner
2014-09-14 09:42:37 UTC
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Hallo Marc,
that sounds really interesting.
I have to admit that I have not (yet) read through all of your mail. I gave
it a test, though. Compiles fine, opening /etc/fstab and playing looks good
so far. Saving to /tmp (:w /tmp/fstab) fails though: It tries to open
"./tmp/fstab.tmp" (note the leading dot).
Saves to absolute paths should now work. The current approach is to
save the file to "filename~" then rename(2)-it to its final destination.

There is still some work left to be done to make sure data is not
visible to unauthorized eyes. The file permissions / ownership should
be restored if possible. Symlinks are currently not handled correctly
(i.e. they are broken after a save) etc.
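
Roughly the following, with error handling and the permission/ownership
handling left out (names illustrative, not the exact code):

#include <stdio.h>
#include <unistd.h>

/* Sketch of the save path described above: write everything to "file~",
 * then rename(2) it over the destination. Error handling as well as
 * restoring permissions / ownership / symlinks is omitted here. */
static int save(const char *filename, const char *data, size_t len)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s~", filename);
    FILE *fp = fopen(tmp, "w");
    if (!fp)
        return -1;
    size_t written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || written != len) {
        unlink(tmp);
        return -1;
    }
    return rename(tmp, filename);
}
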
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
M Farkas-Dyck
2014-09-15 02:02:30 UTC
The default interface is a vim clone called vis.
Name clash on BSD [1][2][3]

[1] https://www.freebsd.org/cgi/man.cgi?query=vis&apropos=0&sektion=0&manpath=FreeBSD+10.0-RELEASE&arch=default&format=html
[2] http://www.openbsd.org/cgi-bin/man.cgi?query=vis
[3] http://netbsd.gw.com/cgi-bin/man-cgi?vis++NetBSD-current
Marc André Tanner
2014-09-15 16:59:36 UTC
Post by M Farkas-Dyck
The default interface is a vim clone called vis.
Name clash on BSD [1][2][3]
This is somewhat unfortunate but the opportunity was too good
not to make a reference to vis[0] (due to the original inspiration
from Wirth's Project Oberon and the fact they support the development
with countless free coffees).

I guess it could be changed to something else though.

[0] http://www.vis.ethz.ch/en/about/
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Dimitris Zervas
2014-09-15 07:49:09 UTC
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Take a look at sandy [0]; it is a suckless vim-like text editor.
We haven't paid much attention to the text management algorithm, but I think that your algorithm is not that difficult to implement.
Apart from that, we've implemented all the features you pointed out, apart from auto-completion.

Have fun coding! :)

[0]: http://git.suckless.org/sandy
Christian Hesse
2014-09-15 08:05:21 UTC
On September 13, 2014 5:01:15 PM EEST, "Marc André Tanner"
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Take a look at sandy [0], it is a suckless vim-like text editor.
We haven't paid much attention at the text management algorithm, but I
think that your algorithm is not that difficult to implement. Apart from
that, we've implemented all the features you pointed, apart from auto
completion.
Have fun coding! :)
[0]: http://git.suckless.org/sandy
sandy is a nice editor, but not vim-like, no? Looking at the man page it
looks similar to nano and friends.

I did test it some time ago, but did not feel familiar with it. (In contrast
to vis, which works as expected for a vim-like editor.)
--
main(a){char*c=/* Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/* Chris get my mail address: */=0;b=c[a++];)
putchar(b-1/(/* gcc -o sig sig.c && ./sig */b/42*2-3)*42);}
Marc André Tanner
2014-09-15 17:16:53 UTC
Post by Dimitris Zervas
Post by Marc André Tanner
TLDR: I'm writing an experimental but (hopefully) highly efficient vim
like text editor based on a piece chain data structure. You will find
an url to a git repository at the end of this rather long mail.
Take a look at sandy [0], it is a suckless vim-like text editor.
I'm well aware of sandy, it is a fine editor! I use some of the same
ideas, for example the syntax highlighting. One of the goals of vis is to
have a clean separation between frontend and backend code; it is therefore
perfectly possible to implement a sandy-like interface on top of it.
Post by Dimitris Zervas
We haven't paid much attention at the text management algorithm, but
I think that your algorithm is not that difficult to implement.
I'm not sure it is that easy with the current sandy codebase.
Post by Dimitris Zervas
Apart from that, we've implemented all the features you pointed,
apart from auto completion.
I don't think so, try to edit something like this:

コンニチハ

or this:

printf "Hello\0World" > TEST && sandy TEST

or this (not recommended):

seq 10000000 > TEST && sandy TEST

Last time I looked the vim bindings weren't that powerful, for example
text objects are not supported? As an example, something like this in a
nested code block: c2i}
Post by Dimitris Zervas
Have fun coding! :)
I have, thanks!
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Dimitris Zervas
2014-09-15 20:37:31 UTC
Post by Marc André Tanner
コンニチハ
printf "Hello\0World" > TEST && sandy TEST
seq 10000000 > TEST && sandy TEST
Last time I looked the vim bindings weren't that powerful, for example
text object are not supported? As an example something like this in a
nested code block: c2i}
Lol, yes you're right. In every example I was like oops, that's a bug :P
Post by Marc André Tanner
seq 10000000 > TEST && sandy TEST
what will happen? (can't test it right now)

So, you suggest we drop sandy?
I'm not telling it "in a bad way", but your project seems (from the comments, I haven't tried it yet) very promising and more worth coding than sandy.
Is that right? (question to sandy users too).
I want to code a text editor that is suckless and actually matters. If your implementation is better, we better code for your project.

Cheers! :)
Teodoro Santoni
2014-09-15 21:16:44 UTC
Post by Dimitris Zervas
I'm not telling it "in a bad way", but your project seems (from the
comments, I haven't tried it yet) very promising and more worth coding than
sandy.
Post by Dimitris Zervas
Is that right? (question to sandy users too).
I want to code a text editor that is suckless and actually matters. If your
implementation is better, we better code for your project.
I would use one editor for hacks, writing mails, notes, my diary implemented in
sh, and rapid touches on things I've written in the IDE the day before,
and another editor for coding, with colors, keybinds/scripts.
For the former I'd prefer to contribute and upgrade to sandy with vi keys over
nvi (which I actually use 'cause... dunno why, one day I quit nano); for the
latter I use vim, and hope to contribute and upgrade to ${video editor
improved, based on vi improved, but suckless, an idea of the dvtm dev}.


--
Teodoro Santoni
Dimitris Papastamos
2014-09-15 21:52:32 UTC
Post by Dimitris Zervas
I want to code a text editor that is suckless and actually matters. If your implementation is better, we better code for your project.
FWIW, the only reason I liked sandy was because it was not yet another
vi clone.

A nice experimental and in-development editor is edit[0]

[0] http://c9x.me/edit/
Lee Fallat
2014-09-15 22:02:19 UTC
Post by Dimitris Papastamos
Post by Dimitris Zervas
I want to code a text editor that is suckless and actually matters. If your implementation is better, we better code for your project.
FWIW, the only reason I liked sandy was because it was not yet another
vi clone.
A nice experimental and in-development editor is edit[0]
[0] http://c9x.me/edit/
Whoa. Yes. Now this is an editor worth getting interested in.

I've sort of created what that guy has, but favored key combos for
editor commands rather than clickable keywords. Same with the command
window: I've opted for a command line and use the text view for all
stderr/out/in.

Is this guy using the p9p graphics lib?
Dimitris Papastamos
2014-09-15 22:13:19 UTC
Post by Lee Fallat
Post by Dimitris Papastamos
Post by Dimitris Zervas
I want to code a text editor that is suckless and actually matters. If your implementation is better, we better code for your project.
FWIW, the only reason I liked sandy was because it was not yet another
vi clone.
A nice experimental and in-development editor is edit[0]
[0] http://c9x.me/edit/
Whoa. Yes. Now this is an editor worth getting interested in.
I've sort've created what that guy has, but favored key combos for
editor commands rather than clickable keywords. Same with the command
window- I've opted for a command line and use the text view for all
stderr/out/in.
Is this guy using the p9p graphics lib?
I think it is Xlib :)
Lee Fallat
2014-09-15 22:19:37 UTC
Post by Dimitris Papastamos
Post by Lee Fallat
Post by Dimitris Papastamos
Post by Dimitris Zervas
I want to code a text editor that is suckless and actually matters. If your implementation is better, we better code for your project.
FWIW, the only reason I liked sandy was because it was not yet another
vi clone.
A nice experimental and in-development editor is edit[0]
[0] http://c9x.me/edit/
Whoa. Yes. Now this is an editor worth getting interested in.
I've sort've created what that guy has, but favored key combos for
editor commands rather than clickable keywords. Same with the command
window- I've opted for a command line and use the text view for all
stderr/out/in.
Is this guy using the p9p graphics lib?
I think it is Xlib :)
You are right. Checked out the code. Very similar to Acme.

I'll take this opportunity to get this thread back on track:
How difficult would it be to go back to the original sockets idea? Is
vis too far gone? I'd like to be able to edit several open buffers
from a single command line.
Raphaël Proust
2014-09-16 09:35:26 UTC
Post by Dimitris Papastamos
A nice experimental and in-development editor is edit[0]
Really nice!

I am planning on making a filter for the sam language. I.e. something
like sed but that would accept sam expressions. That would make a
search and replace trivial in your editor: |sampipe
'x/line/c/sentence'. (Actually the sam structural regexp is the thing
I miss the most when using vi(m).)

I'll definitely give your editor a try!


Cheers,
--
______________
Raphaël Proust
M Farkas-Dyck
2014-09-16 11:19:03 UTC
Post by Raphaël Proust
I am planning on making a filter for the sam language. I.e. something
like sed but that would accept sam expressions.
http://swtch.com/plan9port/man/man1/ssam.html
Raphaël Proust
2014-09-16 12:59:01 UTC
Post by M Farkas-Dyck
Post by Raphaël Proust
I am planning on making a filter for the sam language. I.e. something
like sed but that would accept sam expressions.
http://swtch.com/plan9port/man/man1/ssam.html
Thanks, I didn't know about that…

I also want to fix a few things in the sam semantics, especially
regarding {} enclosed lists of commands. But in the meantime, ssam
will be very useful.


Ciao,
--
______________
Raphaël Proust
Maxime Coste
2014-09-15 13:21:25 UTC
Hello,

Here are a few thoughts on your design, based on my own experience with Kakoune.
Post by Marc André Tanner
Text management using a piece table/chain
=========================================
[...]
While this looks like a nice data structure for editing arbitrary byte
strings, you can get much better actual performance if you decide to write
a text/code editor.

Regular text is naturally line/column oriented, and storing it in the form of
a dynamic array of lines (with lines being simple strings) works very well and
gives excellent performance once you use (line, column) pairs to reference it.

In practice your users think about text in this line/column fashion, which
implies that your text editing will stay mostly line/column centric, so
things end up much simpler when the editing backend itself matches that.

That said, this is limited to actual text; arbitrary byte sequences do not
map well to this, in which case your piece table seems nice.
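
Roughly, such a line oriented buffer boils down to something like this (a
plain C sketch for illustration only, not Kakoune's actual C++
implementation):

#include <stddef.h>

/* Illustrative sketch only: a dynamic array of lines, each line a string. */
typedef struct {
    char *text;         /* the line's characters, including the trailing '\n' */
    size_t len;
} Line;

typedef struct {
    Line *lines;        /* dynamic array of lines */
    size_t count;
    size_t capacity;
} TextBuffer;

/* (line, column) coordinates index directly into the structure: */
static char char_at(const TextBuffer *b, size_t line, size_t col)
{
    return b->lines[line].text[col];
}
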
Post by Marc André Tanner
Screen Drawing
==============
[...]
Window-Management
-----------------
In principle it would be nice to follow a similar client/server approach
as sam/samterm i.e. having the main editor as a server and each window
as a separate client process with communication over a unix domain socket.
[...]
The client/server thing can stay quite simple if you avoid any synchronisation.
In Kakoune, once the connection is done, the client sends keystrokes,
and the server sends display commands. Once you have your poll event loop
(which you will end up having if you want to handle anything asynchronously)
this integrates very easily.
Post by Marc André Tanner
Editor Frontends
================
vis a vim like frontend
-----------------------
[...]
So it seems you are basically targeting something very close to the vi
interface; I am always a little sad to see new editors doing that. vi and vim
have tons of good ideas in them, but the editing model has a lot of room for
improvement to get a more consistent and regular interface.

Kakoune is one direction, integrating multi-selection and focusing on
interactive editing, which gave very good results in terms of keystroke count
(it beats vim on several vimgolf challenges). I expect there are lots of
alternative directions to improve the vi-like user interface, and trying
to improve the implementation without trying to improve the design itself
seems like a waste.


Anyway, best of luck with your project; writing a code editor is a very
rewarding experience.

Cheers,

Maxime Coste.
q***@c9x.me
2014-09-15 15:41:29 UTC
Post by Maxime Coste
Hello,
[...]
Maxime Coste.
I like your advertisement man, keep it up :).
I also like advocating for change rather than
flavorless copy.

- Q.
Dimitris Zervas
2014-09-15 20:50:52 UTC
Post by q***@c9x.me
Post by Maxime Coste
Hello,
[...]
Maxime Coste.
I like your advertisement man, keep it up :).
I also like advocating for change rather than
flavorless copy.
- Q.
Yes, I agree.
I also like how you post a C++ project with trillions of files, libraries, dependencies and weird black magic, in a suckLESS list, over a hundred times.

sorry if I sound mean,
Cheers.
Maxime Coste
2014-09-15 23:12:58 UTC
Hi
Post by Dimitris Zervas
Post by q***@c9x.me
Post by Maxime Coste
Hello,
[...]
Maxime Coste.
I like your advertisement man, keep it up :).
I also like advocating for change rather than
flavorless copy.
- Q.
Yes, I agree.
I also like how you post a C++ project with trillions of files, libraries, dependencies and weird black magic, in a suckLESS list, over a hundred of time.
Hard to resist talking about a project I spent most of my spare coding time
on for the last 3 years.

I agree 18000 lines is not small, but not that big either, clearly reasonable
for the amount of things Kakoune does (and nope, I do not want to replace a
20 instruction function with a fork-exec to use 'tr' to replace tabs with
spaces). 2 hard dependencies (boost and ncurses), with boost going away once
we get more widely available C++11 regexes, is not unreasonable either. As for
weird black magic, I'm not sure what you are talking about.

And for C++, well, I know there are some vocal individuals against it on the
sl mailing list, but I think most members are sensible; we do not need to
stay frozen with C89. C++ is bigger than C and more complex, but provides a lot
of abstraction features that make it easier to reason about and organize your
program. I suspect most of the hatred I see here is due to ignorance, low
familiarity with idiomatic C++ and exposure to horrible code (horrible
code being writable in any language).

So yeah, posted about that on the suckless mailing list, which I have been
following for a long time, and whose philosophy I mostly share (and kept in
mind while designing Kakoune).

All that to say, yep, that's maybe the third thread talking about text editors
where I answer referencing my personal code editor, sorry if I made you
feel overwhelmed.
Post by Dimitris Zervas
sorry if I sound mean,
No worries, that is the internet, we all sound mean.

Cheers,

Maxime Coste.
FRIGN
2014-09-15 23:23:07 UTC
On Tue, 16 Sep 2014 00:12:58 +0100
Post by Maxime Coste
Hard to resist talking about a project I spent most of my spare coding time
on for the last 3 years.
Fair point.
Post by Maxime Coste
I agree 18000 is not small, but not that big either, clearly reasonable
for the amount of things Kakoune does (and nope, I do not want to replace a
20 instructions function with a fork-exec to use 'tr' to replace tabs with
spaces). 2 hard dependencies (boost and ncurses) with boost getting away once
we get more widely available c++11 regexes is not unreasonable either. For
weird black magic, not sure what you are talking about.
Boost == cancer.
Post by Maxime Coste
And for C++, well, I know there is some vocal individuals against it on the
sl mailing list, but I think most members are sensible, we do not need to
stay frozen with C89, C++ is bigger than C, more complex, but provides a lot
of abstraction features that makes it easier to reason and organize your
program. I suspect most of the hatred I see here is due to ignorance, low
familiarity with idiomatic C++ and exposure to horrible code (horrible
code being writable in any language).
Some vocal individuals? I struggle to find anybody who isn't against C++.
C++ does provide some abstraction features, yes, but every time I read
C++ or even boost-code, my brain just shuts off and begs for C's
simplicity and clarity.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Ralph Eastwood
2014-09-16 08:13:00 UTC
Permalink
Post by FRIGN
Some vocal individuals? I struggle to find anybody who isn't against C++.
C++ does provide some abstraction features, yes, but every time I read
C++ or even boost-code, my brain just shuts off and begs for C's
simplicity and clarity.
In my opinion, C++'s good features are also its bad features; the
metaprogramming abilities of C++ are one of the two reasons to use C++. The
other is if you need to get some code done and the only available library is
in C++ (and you have no time to reimplement that library).

If you can get your head round the metaprogramming, C++ provides some
powerful syntactic sugar that is really useful for abstracting complex pieces
of code (e.g. maths-heavy code such as matrix manipulation). The downside is
that you can also use it to hide the complicated bits in tonnes of
metaprogramming which is almost unintelligible to anyone but yourself and a
few metaprogramming gurus.
Post by FRIGN
Post by Maxime Coste
And for C++, well, I know there is some vocal individuals against it on the
sl mailing list, but I think most members are sensible, we do not need to
stay frozen with C89, C++ is bigger than C, more complex, but provides a lot
of abstraction features that makes it easier to reason and organize your
program. I suspect most of the hatred I see here is due to ignorance, low
familiarity with idiomatic C++ and exposure to horrible code (horrible
code being writable in any language).
Maxime, I really like what you've done with Kakoune, although your code base
doesn't seem to use C++'s features heavily, meaning that you could write
equally clean code in C. Why does it have to be C89? C99 is nicer.


For a text editor, C is perfectly adequate as the main operations are to do
with text and you don't really need to worry about a lot of abstractions.
--
Tai Chi Minh Ralph Eastwood
***@gmail.com
Maxime Coste
2014-09-16 19:09:13 UTC
Permalink
Post by Ralph Eastwood
Maxime, I really like what you'e done with kakoune, although your code
base doesn't
seem to use C++'s features heavily, meaning that your could write
equally clean code in C.
Why does it have to be C89? C99 is nicer.
For a text editor, C is perfectly adequate as the main operations are
to do with text and you don't
really need to worry about a lot of abstractions.
Well, I still rely a lot on having a proper type system to do checks at
compile time. For example I have separate types for line counts, byte
counts and char counts, so I get a compilation error if I try to add
bytes and chars, while still getting the convenience of using operators,
and with code that compiles down to plain int operations.

Another example would be safe_ptr<T>: pointers that compile to raw pointers
in optimized builds, but that guarantee that the pointed-to object is alive
(you get an assert if an object dies while a safe_ptr to it still exists).
It still behaves like a pointer, but allows me to both document, and get
debug checks for, the fact that the pointed-to object should by design
outlive this pointer.

C++ is not the most elegant language, but there is nothing better available
IMHO. C89's minimalism is attractive, but no overloading, no generics, and a
weak type system make it harder than necessary to manage complexity. And
modern C (99 and 11) does have its own ugly quirks (the magically overloaded
tgmath.h functions, 'complex' builtin type...).

A common symptom of C's lacking abstraction facilities is the reliance on
linked lists as the most common list data structure. As it is so easy to
implement, you do it quickly for any struct, when in practice a dynamic
array provides you with much better performance characteristics.

Cheers,

Maxime Coste.
Roberto E. Vargas Caballero
2014-09-16 20:02:54 UTC
Permalink
Post by Maxime Coste
Well, I still rely a lot on having a proper type system to do checks at
compile time. For example I have separate types for line counts, byte
Proper... a good word for saying a pain in the ass. One of the best things
of C (and of C++, because they share this part) is automatic conversions,
which remove this work from the programmer. If you have problems with them
maybe you should learn a bit more.
Post by Maxime Coste
counts and char counts, so I get a compilatin error if I try to add
bytes and char, while still getting the conveniences of using operators,
and with code that compiles down to plain int operations.
Operator overloading, one of the worst things about C++ (well, there are so
many of them that this is just one more). It is only useful for making
obfuscated code.
Post by Maxime Coste
Another example would be safe_ptr<T>, pointers that compiles to raw pointers
in optimized builds, but that guarantee that the pointed to object is alive
(you get an assert if an object dies when a safe_ptr to them still exists).
It stills behave like a pointer, but allows me to both document, and get
debug checks, that the pointed to object should by design outlive this pointer.
I don't need these brain-damaged pointers; again, if you have this kind of
problem you should learn a bit more. Like a friend of mine says, coding in C
is like sex: you have to know what you are doing.
Post by Maxime Coste
C++ is not the most elegant language, but there is nothing better available
IMHO. C89's minimalism is attractive, but no overloading, no generics, and
weak type system makes it harder than necessary to manage complexity. And
modern C (99 and 11) does have its own ugly quirks (the magically overloaded
tgmath.h functions, 'complex' builtin type...).
No generics is a feature. Generics are a very stupid idea that only creates
bloated binaries (this is one of the points I don't like about C11). Also,
the complexity of generics in a language with automatic conversions
like C (and C++) is too much.
Post by Maxime Coste
A common symptom of C's lacking abstraction facilities is the reliance on
linked lists as the most common list data structure. As it is so easy to
implement you add do it quickly for any struct, when in practice a dynamic
array provides you with much better performance characteristics.
It depends. Basically it depends on the size and the operations you do on
it. I suppose you know that inserting in a dynamic array is O(n^2), and of
course a search in an unordered array is O(n), while inserting at the head
of a list is O(1). Below a certain size most operations on dynamic arrays
are slower, and in many of these small-size cases you can use
over-dimensioned static arrays that are far faster than dynamic arrays.

Stop this C++ proselytism, because the only thing you are going to get is
becoming a troll. If you like masochism that is your problem, but please
don't push it on us.

Regards,
--
Roberto E. Vargas Caballero
Maxime Coste
2014-09-16 20:30:47 UTC
Permalink
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Well, I still rely a lot on having a proper type system to do checks at
compile time. For example I have separate types for line counts, byte
Proper..., good word for saying a pain in the ass. One of the best thing
of C (and C++ because they share this part) is automatic conversions
that remove this work of the programmer. If you have problems
with them maybe you should learn a bit more.
Ok, so what exactly is the sum of 3 lines and 2 bytes? The whole point is to
catch at compile time code that is logically invalid: if you have
f(ByteCount, LineCount), you cannot call it with the arguments swapped as
(LineCount, ByteCount). In C you would be forced to use f(int, int), and
need long debugging sessions to discover this simple mistake.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
counts and char counts, so I get a compilatin error if I try to add
bytes and char, while still getting the conveniences of using operators,
and with code that compiles down to plain int operations.
Operator overloading, one of the worst things of C++ (well, there are
so much of them that is only more of them). It only is useful to make
obfuscated code.
It just gives you tools so that you can write your code closer to the domain
language. Did you learn linear algebra writing matrix_add(m1, m2, &m3), or
m3 = m1 + m2?
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Another example would be safe_ptr<T>, pointers that compiles to raw pointers
in optimized builds, but that guarantee that the pointed to object is alive
(you get an assert if an object dies when a safe_ptr to them still exists).
It stills behave like a pointer, but allows me to both document, and get
debug checks, that the pointed to object should by design outlive this pointer.
I don't need this damaged brain pointers, again if you have this kind
of problems you should learn a bit more. Like a friend of mine says,
code in C is like sex, you have to know what you are doing.
You probably never worked on complicated enough code bases if you believe
every program fits entirely in your brain. Putting safeguards in your code,
asserting that what you believe to be correct actually is, is necessary if
you want to keep your sanity.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
C++ is not the most elegant language, but there is nothing better available
IMHO. C89's minimalism is attractive, but no overloading, no generics, and
weak type system makes it harder than necessary to manage complexity. And
modern C (99 and 11) does have its own ugly quirks (the magically overloaded
tgmath.h functions, 'complex' builtin type...).
No generic is a feature. Generic are very stupid idea that only creates
blown binaries (this is one of the point I don't like about C11). Also,
the compexity of generics in a lenguage with automatic conversions
like C (and C++) is too much.
What is silly is rewriting the same function with different argument types
again and again, or ending up relying on macros to emulate generics.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
A common symptom of C's lacking abstraction facilities is the reliance on
linked lists as the most common list data structure. As it is so easy to
implement you add do it quickly for any struct, when in practice a dynamic
array provides you with much better performance characteristics.
Depend. Basically depend of the size and the operations you do in it.
I suppose you know that inserting in a dynamic array is O(n^2), and of
course, searches in an unordered array is O(n), while inserting
in the head of a list is O(k). There is a limit in the size where almost
operations in dynamic arrays are slower, and in a lot of times with this
small sizes you can use overdimensioned arrays that are far faster
than dynamic arrays.
Get your complexity right: inserting in a dynamic array is O(n), and the
eventual need for an allocation is amortized (whereas you always end up doing
a malloc for your linked lists). Another thing you should look up is modern
CPU architectures and caches; in practice the much better locality of
reference of arrays makes them *way* better on operations like insert/erase
in the middle than lists, even though complexity theory says otherwise.
(Remember, complexities are asymptotic, you need a huuuuuge number of
elements).
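(As a throwaway counting sketch of the amortized part, with invented numbers:
with capacity doubling, the element copies caused by reallocation over n
appends stay below 2n, i.e. O(1) amortized per append.)

#include <stdio.h>

int main(void)
{
        size_t cap = 1, len = 0, copies = 0;
        size_t n = 1000000;

        for (size_t i = 0; i < n; i++) {
                if (len == cap) {
                        copies += len;  /* a realloc moves the len existing elements */
                        cap *= 2;
                }
                len++;
        }
        printf("%zu appends caused %zu element copies (< %zu)\n",
               n, copies, 2 * n);
        return 0;
}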
Post by Roberto E. Vargas Caballero
Stop this C++ proselytism because the only thing you are going to
get is becoming a troll. If you like maschoshism is your problem, but
please don't tell to us.
Can't we have a civilized discussion ?

Cheers,

Maxime Coste.
Christoph Lohmann
2014-09-16 21:02:40 UTC
Permalink
Greetings.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Well, I still rely a lot on having a proper type system to do checks at
compile time. For example I have separate types for line counts, byte
Proper..., good word for saying a pain in the ass. One of the best thing
of C (and C++ because they share this part) is automatic conversions
that remove this work of the programmer. If you have problems
with them maybe you should learn a bit more.
Ok, so what exactly is the sum of 3 lines and 2 bytes ? The whole point
is to catch at compilation code that is logically invalid, if you have
f(ByteCount, LineCount), you cannot call it with a (LineCount, ByteCount)
signature. In C you would be forced to use f(int, int), and long debugging
sessions to discover this simple mistake.
Then write better and more logical access functions instead of your object
abstraction. It keeps your mind simpler and speeds up your code.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
counts and char counts, so I get a compilatin error if I try to add
bytes and char, while still getting the conveniences of using operators,
and with code that compiles down to plain int operations.
Operator overloading, one of the worst things of C++ (well, there are
so much of them that is only more of them). It only is useful to make
obfuscated code.
It just gives you tool so that you can write your code closed to the domain
language, did you learn linear algebra writing matrix_add(m1, m2, &m3)
? or m3 = m1 + m2 ?
This is programming and not your playground. Avoid fancy code.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Another example would be safe_ptr<T>, pointers that compiles to raw pointers
in optimized builds, but that guarantee that the pointed to object is alive
(you get an assert if an object dies when a safe_ptr to them still exists).
It stills behave like a pointer, but allows me to both document, and get
debug checks, that the pointed to object should by design outlive this pointer.
I don't need this damaged brain pointers, again if you have this kind
of problems you should learn a bit more. Like a friend of mine says,
code in C is like sex, you have to know what you are doing.
You probably never worked on complicated enough code bases if you believe every
program fits entirely in your brain. Putting safe guards in you code asserting
that what you believe is correct actually is is necessary if you want to keep
your sanity.
Suckless is about writing simple code bases. When your choice for C++ is
because you want to write complex code bases then you are in the wrong
community and should leave as fast as you can.

By adding constraints on the hidden complexity it is deliberately made
harder to write useless abstraction code.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
C++ is not the most elegant language, but there is nothing better available
IMHO. C89's minimalism is attractive, but no overloading, no generics, and
weak type system makes it harder than necessary to manage complexity. And
modern C (99 and 11) does have its own ugly quirks (the magically overloaded
tgmath.h functions, 'complex' builtin type...).
No generic is a feature. Generic are very stupid idea that only creates
blown binaries (this is one of the point I don't like about C11). Also,
the compexity of generics in a lenguage with automatic conversions
like C (and C++) is too much.
what is silly is rewriting the same function with different arguments again
and again. Or ending up relying on macros to emulate generics.
Learn to code in C.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
A common symptom of C's lacking abstraction facilities is the reliance on
linked lists as the most common list data structure. As it is so easy to
implement you add do it quickly for any struct, when in practice a dynamic
array provides you with much better performance characteristics.
Depend. Basically depend of the size and the operations you do in it.
I suppose you know that inserting in a dynamic array is O(n^2), and of
course, searches in an unordered array is O(n), while inserting
in the head of a list is O(k). There is a limit in the size where almost
operations in dynamic arrays are slower, and in a lot of times with this
small sizes you can use overdimensioned arrays that are far faster
than dynamic arrays.
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
architectures and caches, in practice the much better locality of reference
of arrays makes them *way* better on operation like insert/erase in the middle
than lists even though complexity theory says otherwise. (Remember, complexities
are asymptotic, you need huuuuuge number of elements).
When you work close to the metal much of your theory can be optimized
out. I won’t tell you how.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Stop this C++ proselytism because the only thing you are going to
get is becoming a troll. If you like maschoshism is your problem, but
please don't tell to us.
Can't we have a civilized discussion ?
No, civilisation ended when »C++« was mentioned. Code abstraction and
bad design choices made from idiots relying on OOP are the reason why
your local Windows machine is so slow in loading drivers, opening Win‐
dows, loading the help file in the regular Java NULL pointer exception
or simply loading and displaying text in a webbrowser.
This can only be avoided by changing and restricting the system.


Sincerely,

Christoph Lohmann
FRIGN
2014-09-16 21:20:34 UTC
Permalink
On Tue, 16 Sep 2014 23:02:40 +0200
Post by Christoph Lohmann
No, civilisation ended when »C++« was mentioned. Code abstraction and
bad design choices made from idiots relying on OOP are the reason why
your local Windows machine is so slow in loading drivers, opening Win‐
dows, loading the help file in the regular Java NULL pointer exception
or simply loading and displaying text in a webbrowser.
This can only be avoided by changing and restricting the system.
I sign this.

Makes you wonder why people don't question the fact that with each iteration
of a new operating system and set of software released they are required
to acquire new hardware to handle it.
--
FRIGN <***@frign.de>
Maxime Coste
2014-09-16 23:38:27 UTC
Permalink
Post by Christoph Lohmann
Post by Maxime Coste
Ok, so what exactly is the sum of 3 lines and 2 bytes ? The whole point
is to catch at compilation code that is logically invalid, if you have
f(ByteCount, LineCount), you cannot call it with a (LineCount, ByteCount)
signature. In C you would be forced to use f(int, int), and long debugging
sessions to discover this simple mistake.
Then write better and more logic access functions instead of your object
abstraction. It keeps your mind simpler and speeds up your code.
These are not objects, just ints; it's an implementation detail that I use a
class to get segregated types and to specify exactly what can be added to
what. On any platform with a decent calling ABI this compiles to the same
assembly, the only difference being a nice error message if I try to do
something senseless.
Post by Christoph Lohmann
Post by Maxime Coste
It just gives you tool so that you can write your code closed to the domain
language, did you learn linear algebra writing matrix_add(m1, m2, &m3)
? or m3 = m1 + m2 ?
This is programming and not your playground. Avoid fancy code.
I guess that is a matter of taste, I just know m1 + m2 calls operator+(Matrix, Matrix).
Post by Christoph Lohmann
Suckless is about writing simple code bases. When your choice for C++ is
because you want to write complex code bases then you are in the wrong
community and should leave as fast as you can.
By adding constraints on the hidden complexity it is by cause made hard‐
er to write useless abstraction code.
You cannot always simplify the problem to keep a simple solution; sometimes
the problem *is* complex. I put safeguards in the code, especially when they
have zero cost in release builds, because I have debugged enough clever C
code to prefer a compilation error or an assert failing as soon as possible
over a long session in gdb.
Post by Christoph Lohmann
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
No generic is a feature. Generic are very stupid idea that only creates
blown binaries (this is one of the point I don't like about C11). Also,
the compexity of generics in a lenguage with automatic conversions
like C (and C++) is too much.
what is silly is rewriting the same function with different arguments again
and again. Or ending up relying on macros to emulate generics.
Learn to code in C.
What is your strategy in C when you need to apply the same logic on different types ?
Post by Christoph Lohmann
Post by Maxime Coste
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
architectures and caches, in practice the much better locality of reference
of arrays makes them *way* better on operation like insert/erase in the middle
than lists even though complexity theory says otherwise. (Remember, complexities
are asymptotic, you need huuuuuge number of elements).
When you work close to the metal much of your theory can be optimized
out. I won’t tell you how.
The best you can do is store your linked list in an array, and sort it at a certain
point so that elements end up linearly in memory. But when you've done that you
already have a dynamic array implementation.
Post by Christoph Lohmann
No, civilisation ended when »C++« was mentioned. Code abstraction and
bad design choices made from idiots relying on OOP are the reason why
your local Windows machine is so slow in loading drivers, opening Win‐
dows, loading the help file in the regular Java NULL pointer exception
or simply loading and displaying text in a webbrowser.
This can only be avoided by changing and restricting the system.
I blame that on .NET... More seriously, yep, there is a lot of ugly C++ out
there, and OOP nonsense; I've seen my share of extra-deep class hierarchies
with virtual methods everywhere. That code might have been written in C++,
but it's very far from idiomatic, modern C++. The IOCCC tends to show that
C is far from immune to unintelligible code.

Regards,

Maxime Coste.
Christoph Lohmann
2014-09-17 04:29:01 UTC
Permalink
Greetings.
Post by Maxime Coste
Post by Christoph Lohmann
This is programming and not your playground. Avoid fancy code.
I guess that is a matter of taste, I just know m1 + m2 calls operator+(Matrix, Matrix).
Which is, as stated in other answers, not obvious. You have to look up
the definition to read the source. Hint: here complexity in the system
can be avoided completely.
Post by Maxime Coste
Post by Christoph Lohmann
Suckless is about writing simple code bases. When your choice for C++ is
because you want to write complex code bases then you are in the wrong
community and should leave as fast as you can.
By adding constraints on the hidden complexity it is by cause made hard‐
er to write useless abstraction code.
You cannot always simplify the problem to keep a simple solution, sometimes the
problem *is* complex. I put safe guards in the code, especially when the have
zero cost in release, because I have debugged enough clever C like code to
prefer a compilation error or an assert failing as soon as possible to a long
session in gdb.
Look at the suckless tools. If you really start from the beginning with
the intention to write complex code you will end up with complex code.
Split the problems up into tools: the Unix philosophy. This is not your
commercial environment, where you can't afford to publish subsets before
you take tons of money from clients buying licenses from resellers, giving
nearly no support.
Post by Maxime Coste
Post by Christoph Lohmann
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
No generic is a feature. Generic are very stupid idea that only creates
blown binaries (this is one of the point I don't like about C11). Also,
the compexity of generics in a lenguage with automatic conversions
like C (and C++) is too much.
what is silly is rewriting the same function with different arguments again
and again. Or ending up relying on macros to emulate generics.
Learn to code in C.
What is your strategy in C when you need to apply the same logic on different types ?
That doesn’t happen that often to justify overloading. Hint: Avoided
complexity in the system *beforehand*.
Post by Maxime Coste
Post by Christoph Lohmann
Post by Maxime Coste
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
architectures and caches, in practice the much better locality of reference
of arrays makes them *way* better on operation like insert/erase in the middle
than lists even though complexity theory says otherwise. (Remember, complexities
are asymptotic, you need huuuuuge number of elements).
When you work close to the metal much of your theory can be optimized
out. I won’t tell you how.
The best you can do is store your linked list in an array, and sort it at certain
point so that elements end up linearly in memory. But when you've done that you
already have a dynamic array implementation.
No, the best is to apply whichever data structure you need at the moment.
Discussing which array to use makes no sense without the actual problem
defined.
Post by Maxime Coste
Post by Christoph Lohmann
No, civilisation ended when »C++« was mentioned. Code abstraction and
bad design choices made from idiots relying on OOP are the reason why
your local Windows machine is so slow in loading drivers, opening Win‐
dows, loading the help file in the regular Java NULL pointer exception
or simply loading and displaying text in a webbrowser.
This can only be avoided by changing and restricting the system.
I blame that on .net... More seriously, yep there is a lot of ugly C++ out
there, and OOP nonsense, I've seen my share of extra deep class hierarchies
with virtual methods everywhere. That code might have been written in C++,
but its very far from idiomatic, modern C++. The ioccc tends to show that
C is far from imune from unintelligile code.
You are too young if you think .NET is the problem. The problem arose
before that, with C++ in the Windows world and X11 in Unix. Let's see
which wrong logic answer you will apply to the last argument.

Conclusion: You try to force your work experience into the suckless
philosophy, which does not work due to different basic principles. Next, you
fail to accept that reading source code in open source is more important
than your abstraction of reusing code between hating and opportunity-seeking
programmer groups in corporate life. Please come down from your throne.



Sincerely,

Christoph Lohmann
Maxime Coste
2014-09-17 18:42:47 UTC
Permalink
Post by Christoph Lohmann
Post by Maxime Coste
Post by Christoph Lohmann
This is programming and not your playground. Avoid fancy code.
I guess that is a matter of taste, I just know m1 + m2 calls operator+(Matrix, Matrix).
Which is, like in other answers stated, not obvious. You have to look up
the definition to read the source. Hint: Here complexity in the system
can be avoided completely.
How exactly is that different from seeing mat_add(m1, m2, &m3)? You see
m3 = m1 + m2, you know a + b is syntactic sugar for operator+(a, b). That is
all.

It seems to me that you are saying 'to someone used only to C, C++ operator
overloading is confusing', to which I can only answer yes, just as pointer
arithmetic is confusing for people not used to C or C++.
Post by Christoph Lohmann
Post by Maxime Coste
What is your strategy in C when you need to apply the same logic on different types ?
That doesn’t happen that often to justify overloading. Hint: Avoided
complexity in the system *beforehand*.
That goes back to the linked list/array thing: you don't have generics, so
you use the easy thing without generics, linked lists, which are almost
always a poor choice.
Post by Christoph Lohmann
Post by Maxime Coste
Post by Christoph Lohmann
Post by Maxime Coste
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
architectures and caches, in practice the much better locality of reference
of arrays makes them *way* better on operation like insert/erase in the middle
than lists even though complexity theory says otherwise. (Remember, complexities
are asymptotic, you need huuuuuge number of elements).
When you work close to the metal much of your theory can be optimized
out. I won’t tell you how.
The best you can do is store your linked list in an array, and sort it at certain
point so that elements end up linearly in memory. But when you've done that you
already have a dynamic array implementation.
No, the best is to apply whichever data structure you need at the mo‐
ment. Discussing which array to use makes no sense without the actual
problem defined.
As I said, in practice you almost always get better performance with dynamic
arrays than with lists, hence they should be your default list structure. But
as they are a pain to implement in C, because you cannot implement them
generically, linked lists stay the default implementation.
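(For a taste of what 'generically' tends to look like in C, here is a rough
macro-based sketch with invented names; every further operation has to be
stamped out the same way, and errors surface inside expanded macros.)

#include <stdlib.h>

#define VEC_DEFINE(name, type)                                          \
        struct name { type *data; size_t len, cap; };                   \
        static int name##_push(struct name *v, type x)                  \
        {                                                               \
                if (v->len == v->cap) {                                 \
                        size_t ncap = v->cap ? 2 * v->cap : 8;          \
                        type *p = realloc(v->data, ncap * sizeof *p);   \
                        if (!p)                                         \
                                return -1;                              \
                        v->data = p;                                    \
                        v->cap = ncap;                                  \
                }                                                       \
                v->data[v->len++] = x;                                  \
                return 0;                                               \
        }

VEC_DEFINE(intvec, int)        /* generates struct intvec and intvec_push() */
VEC_DEFINE(dblvec, double)     /* generates struct dblvec and dblvec_push() */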
Post by Christoph Lohmann
You are too young, if you think .NET is the problem. The problem arose
before that with C++ in the Windows world and X11 in Unix. Let’s see
which wrong logic answer you will apply to the last argument.
Yay, X11 is a well-known C++ program. Seriously, there are a lot of bad
design decisions in Windows, as there are a lot in the modern Linux userland,
and they do not seem to stem from the implementation language. The 90s were
the time of OOP everywhere, and we have learned a lot since then. But
rejecting the tools because they were overused makes no sense; the C for
construct can be overused as well, we now have guidelines on that, and they
are not 'for loops are evil'.
Post by Christoph Lohmann
Conclusion: You try to force your work experience into the suckless phi‐
losophy, which does not work due to different basic principles. Next you
are fail to accept that reading source code in Open Source is more im‐
portant than your abstraction of reusing code between hating and oppor‐
tunity‐seeking programmer groups in corporate lifes. Please come down
from your throne.
I'm merely trying to defend the fact that I would like to post about a C++
project on the suckless mailing list and have it judged on its own merits
rather than on dogmatic ideas about its implementation language.

Cheers,

Maxime.
q***@c9x.me
2014-09-17 19:40:44 UTC
Permalink
Post by Maxime Coste
That doesn't happen that often to justify overloading. Hint: Avoided
complexity in the system *beforehand*.
That goes back to the linked list/array thing, you dont have generics, so you use the
easy thing without generics: linked lists, which are almost always a poor choice.
I don't think that is true: static arrays do the job way more often than
people want to admit, and they are very well supported by C. Also,
performance is critical in fewer cases than people like Bjarne Stroustrup
want to admit, and when it is critical you probably don't even want to rely
on the STL, since it is not finely tuned (as Facebook's home-brewed library
shows, for instance).

-- Q.
Maxime Coste
2014-09-17 19:46:36 UTC
Permalink
Post by q***@c9x.me
Post by Maxime Coste
That doesn't happen that often to justify overloading. Hint: Avoided
complexity in the system *beforehand*.
That goes back to the linked list/array thing, you dont have generics, so you use the
easy thing without generics: linked lists, which are almost always a poor choice.
I don't think that is true, static arrays do the job way more often than people want
to admit, and they are very well supported by C. Also, performance is critical in
less cases than people like Bjarne Stroustrup want to admit, when it's critical you
probably don't even want to rely on STL since it is not finely tuned (as Facebook's
home brewed library shows, for instance).
Yep, static arrays do the trick in many cases, and the STL is not always the
best choice. What I am saying is that when you need a list that can handle an
arbitrary number of elements, dynamic arrays are the best bet, and you can
write a generic one and reuse it in C++, but having one in C gets ugly
rapidly and people end up preferring a quick and dirty linked list.

In the end, it's just an example to show that C's lack of abstraction
mechanisms can have a bad influence on performance, because C programmers,
like all others, do like convenience.

Cheers,

Maxime Coste.
Roberto E. Vargas Caballero
2014-09-17 20:34:11 UTC
Permalink
Post by q***@c9x.me
Post by Maxime Coste
That goes back to the linked list/array thing, you dont have generics, so you use the
easy thing without generics: linked lists, which are almost always a poor choice.
Writing a polymorphic array in C is less than 10 lines. Any decent
programmer should be able to code it; can't you? And please don't say that
polymorphic functions are inefficient, because our mighty super compiler can
also do very good optimizations on inline functions.
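(For illustration, a rough sketch of such a void*-based array, with invented
names; the element size is fixed per array and callers cast on access.)

#include <stdlib.h>
#include <string.h>

struct array {
        void *data;
        size_t len, cap, size;    /* size = element size in bytes */
};

static int array_push(struct array *a, const void *elem)
{
        if (a->len == a->cap) {
                size_t ncap = a->cap ? 2 * a->cap : 8;
                void *p = realloc(a->data, ncap * a->size);
                if (!p)
                        return -1;
                a->data = p;
                a->cap = ncap;
        }
        memcpy((char *)a->data + a->len * a->size, elem, a->size);
        a->len++;
        return 0;
}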

This reminds me of this from Rob Pike:

Early in the rollout of Go I was told by someone that he
could not imagine working in a language without generic
types. As I have reported elsewhere, I found that an odd
remark.

To be fair he was probably saying in his own way that he
really liked what the STL does for him in C++. For the
purpose of argument, though, let's take his claim at face
value.

What it says is that he finds writing containers like lists
of ints and maps of strings an unbearable burden. I find
that an odd claim. I spend very little of my programming
time struggling with those issues, even in languages without
generic types.

But more important, what it says is that types are the way
to lift that burden. Types. Not polymorphic functions or
language primitives or helpers of other kinds, but types.

That's the detail that sticks with me.
Post by q***@c9x.me
I don't think that is true, static arrays do the job way more often than people want
to admit, and they are very well supported by C. Also, performance is critical in
less cases than people like Bjarne Stroustrup want to admit, when it's critical you
probably don't even want to rely on STL since it is not finely tuned (as Facebook's
home brewed library shows, for instance).
Yes, they don't realize how generics destroy the locality of code because
they generate huge bloated binaries.
--
Roberto E. Vargas Caballero
Teodoro Santoni
2014-09-17 19:51:31 UTC
Permalink
Post by Maxime Coste
Seriously, there are a lot of bad design
decision in Windows, as there a are a lot in modern linux userland, and
they do not seems to stem from the implementation language.
When you pull in .NET and don't rewrite the code from the ground up but
merely put new code atop the big C++ pile, it seems to me that C++ is
probably involved.
Post by Maxime Coste
I'm merely trying to defend the fact that I would like to post about a C++
project on the suckless mailing list and have it judged on its own merits
rather than on dogmatic ideas about its implementation language.
Welp, you made it, I'm hooking the bait. You're putting the editor out every
single goddamn time someone discusses text editors: while everyone else
discusses ideas, you start with an overview of the idea and then say "hey my
project Kakoune yo download I did this and that git pull friend". You
invested three years of your life in that project, but this is development of
**another** text editor. And if you can't think outside the Kakoune box, say
so, so I can action-drop your emails.
If I'm not allowed to talk, having never ever programmed my programmers'
editor, I will make my text editor for programmers and then talk again about
text editor development, if you promise me you'll not hang around my project
praising your Kakoune editor.
About the editor: yo, an 18k-line codebase, with Java-style types, and
without even a builtin window manager! What the hell is that kakoune rc
format, vimscript lite?


--
Teodoro Santoni
Andrew Hills
2014-09-17 22:20:10 UTC
Permalink
Post by Teodoro Santoni
Welp, you made it, i'm hooking the bait. You're putting the editor out every
So... twice?
FRIGN
2014-09-16 21:16:28 UTC
Permalink
On Tue, 16 Sep 2014 21:30:47 +0100
Post by Maxime Coste
Ok, so what exactly is the sum of 3 lines and 2 bytes ? The whole point
is to catch at compilation code that is logically invalid, if you have
f(ByteCount, LineCount), you cannot call it with a (LineCount, ByteCount)
signature. In C you would be forced to use f(int, int), and long debugging
sessions to discover this simple mistake.
It depends on the strength of the initial design. Designing function
prototypes implicitly depends on the ability of the programmer to abstract
well enough that these errors don't show up.
Post by Maxime Coste
It just gives you tool so that you can write your code closed to the domain
language, did you learn linear algebra writing matrix_add(m1, m2, &m3)
? or m3 = m1 + m2 ?
You basically owned yourself with this argument. With a function matrix_add(),
you know what's happening (well, you know there is a function call).

Overloaded operators are a pain in the ass not only because of complex
compilation or convoluted code, but also because of the gotchas when you
change scope.

It's hell for someone trying to get used to a new codebase. In C, he could've
easily checked matrix_add and studied it, whereas with overloaded operators,
it could happen _anywhere_ in the code.
Good luck finding it!
Post by Maxime Coste
You probably never worked on complicated enough code bases if you believe every
program fits entirely in your brain. Putting safe guards in you code asserting
that what you believe is correct actually is is necessary if you want to keep
your sanity.
There's a reason why big companies have moved away from C++ for serious stuff
and now use Java and other crap.
Not particularly a favorable thing for us suckless-people, but it kinda shows
that C++ is not the "commercial grade"-language you want us to believe it is.
Post by Maxime Coste
what is silly is rewriting the same function with different arguments again
and again. Or ending up relying on macros to emulate generics.
Again, a matter of design.
Post by Maxime Coste
Can't we have a civilized discussion ?
Is Maxime a male or female name?

Cheers

FRIGN
--
FRIGN <***@frign.de>
Roberto E. Vargas Caballero
2014-09-16 21:20:45 UTC
Permalink
Post by Maxime Coste
Ok, so what exactly is the sum of 3 lines and 2 bytes ? The whole point
is to catch at compilation code that is logically invalid, if you have
f(ByteCount, LineCount), you cannot call it with a (LineCount, ByteCount)
signature. In C you would be forced to use f(int, int), and long debugging
sessions to discover this simple mistake.
Can you explain to me why in C you have to represent them as int or long and
not with a structure, like you did in C++?
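(A minimal sketch of what I mean in plain C, with invented names: two
distinct single-member structs are incompatible types, so mixing them up is
a compile-time error; the cost is spelling out the operations instead of
using '+'.)

typedef struct { long n; } ByteCount;
typedef struct { long n; } LineCount;

static ByteCount bytes_add(ByteCount a, ByteCount b)
{
        return (ByteCount){ a.n + b.n };
}

/* f(ByteCount, LineCount) can then no longer be called with the arguments
 * swapped; the compiler rejects it. What you do not get without overloading
 * is the '+' syntax; you call bytes_add() explicitly. */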
Post by Maxime Coste
It just gives you tool so that you can write your code closed to the domain
language, did you learn linear algebra writing matrix_add(m1, m2, &m3)
? or m3 = m1 + m2 ?
It is only useful in some algebra cases, where it is usually better to
incorporate the operations into the language instead of this user
overloading. Can you explain to me what the meaning of '+' is when you have,
for example, two lines? The element-by-element addition of them, or their
concatenation?
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
I don't need this damaged brain pointers, again if you have this kind
of problems you should learn a bit more. Like a friend of mine says,
code in C is like sex, you have to know what you are doing.
You probably never worked on complicated enough code bases if you believe every
program fits entirely in your brain. Putting safe guards in you code asserting
that what you believe is correct actually is is necessary if you want to keep
your sanity.
I have worked on complex code bases written in C++, where the use of C++
made it impossible to keep them in the brain. In all the other projects, C
and a correct division between tools made it possible to keep the programs
in my mind. If you have pointers that are allocated in one place and freed
in another place, then your code should be written in another way. A good
book for you could be 'The Practice of Programming'.
Post by Maxime Coste
what is silly is rewriting the same function with different arguments again
and again. Or ending up relying on macros to emulate generics.
Try not to need such things. For example, Go has no generics and you never
need them. You like them because you are so used to them that you don't see
the kind of code you are forced to write with them.
Post by Maxime Coste
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
No. Because the realloc that you have to do may, in the worst case, have to
move n-1 elements into a new memory arena. So you have n (moving to the
position) * n-1 (memcpy to the new position in memory) = n^2. You can
decrease the probability of reallocation by allocating more memory than
needed, but when the reallocation happens you get the O(n^2). In a list you
never have a reallocation.
Post by Maxime Coste
architectures and caches, in practice the much better locality of reference
of arrays makes them *way* better on operation like insert/erase in the middle
than lists even though complexity theory says otherwise. (Remember, complexities
are asymptotic, you need huuuuuge number of elements).
If you read what I have said: if you insert at the HEAD, and always take
the HEAD (FIFO fashion), you don't get any locality problem, because
you directly access (or remove) the element you need. And now that you
are talking about deletion in dynamic arrays, can you explain to me how you
delete in the middle of them? AFAIK there are only two ways:

        - moving all the elements one position to the left, which means you
          have to copy all the elements of the array, which is slow and
          destroys any locality.
        - allowing holes in the array, which gives you a bigger array due to
          the holes and to the field needed to mark elements as deleted,
          and I don't want to talk about the problems of inserting a new
          element (in the holes or at the end?).


If you have a lot of insertions/deletions in the middle, dynamic arrays are
a very bad idea. They are only good if you always insert/remove at the end.
I usually build lists inside static arrays, something like:

struct element {
        /*
         * element fields ...
         */
        struct element *list1;  /* next element in the first list order */
        struct element *list2;  /* next element in the second list order */
        /*
         * more list pointers ...
         */
} list[200];

Can you explain to me where the locality problem is here? If you want to run
over the array you can, and if you want to generate lists using some other
order then you run in the order you want. With a dynamic array you can only
have one order, and if you need some other order you need an additional
structure, which is a list, where you can only store indexes into the array,
because, oh man, reallocation of arrays changes the address of the
elements!!!!

Lists are good for accessing one element in the order you want. Arrays
are good for accessing all the elements always in the same order.
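(A self-contained toy version of that idea, with invented names and four
elements instead of 200: the same static pool can be walked in array order
or in the order the list1 links impose.)

#include <stdio.h>

struct element {
        int value;
        struct element *list1;  /* a second traversal order */
} pool[4];

int main(void)
{
        struct element *head = NULL;

        /* fill the static array and thread list1 in reverse order */
        for (int i = 0; i < 4; i++) {
                pool[i].value = i;
                pool[i].list1 = head;
                head = &pool[i];
        }

        for (int i = 0; i < 4; i++)     /* array order: 0 1 2 3 */
                printf("%d ", pool[i].value);
        printf("\n");
        for (struct element *e = head; e; e = e->list1)  /* list order: 3 2 1 0 */
                printf("%d ", e->value);
        printf("\n");
        return 0;
}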

Regards,
--
Roberto E. Vargas Caballero
Roberto E. Vargas Caballero
2014-09-16 22:16:22 UTC
Permalink
Post by Roberto E. Vargas Caballero
No. Because the realloc that you have to do maybe has to move in the
worst case n-1 elements in a new memory arena. So you have n (moving
to the position) * n-1 (memcpy to new position in memory) = n^2. You
can decrease the possibility of reallocation allocating
more memory of needed, but when you have the reallocation you get
the O(n^2). In a list you don't have ever a reallocation.
Oops, I made a mistake here. It is 2n, which means a complexity of O(n), so
on this point you were right, but it doesn't change the discussion much.

For a summary of these complexities, see this table I took from somewhere:

                              Linked list         Array   Dynamic array    Balanced tree

Indexing                      O(n)                O(1)    O(1)             O(log n)
Insert/delete at beginning    O(1)                N/A     O(n)             O(log n)
Insert/delete at end          O(1)                N/A     O(1) amortized   O(log n)
Insert/delete in middle       search time + O(1)  N/A     O(n)             O(log n)
Wasted space (average)        O(n)                0       O(n)             O(n)

Regards,
--
Roberto E. Vargas Caballero
Maxime Coste
2014-09-16 23:03:55 UTC
Permalink
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Ok, so what exactly is the sum of 3 lines and 2 bytes ? The whole point
is to catch at compilation code that is logically invalid, if you have
f(ByteCount, LineCount), you cannot call it with a (LineCount, ByteCount)
signature. In C you would be forced to use f(int, int), and long debugging
sessions to discover this simple mistake.
Can you explain me why in C you have to represent them as int or long as
not with a structure like you did in c++?.
Because I want the convenience of adding two byte counts using +, which I can
do in C++, and have that compile down to exactly the same assembly as if I used ints,
but still get a compilation error if I try to add a byte count and a line count.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
It just gives you tool so that you can write your code closed to the domain
language, did you learn linear algebra writing matrix_add(m1, m2, &m3)
? or m3 = m1 + m2 ?
It is only useful in some algebra cases where usually is better
incorporate them into the language instead of this user overloading.
Can you explaing me what is the meaning of '+' when you have for example
two lines? the addition one by one of each element or the concatenation
of them?.
Why would I overload addition on lines? As you say, there is no canonical
operation that makes sense there. I use operator overloading only on types
where it matches their semantics.
Post by Roberto E. Vargas Caballero
I have worked in complex code bases written in C++, where the use of C++
did that was impossible to keep them in the brain. In all the other
projects C and a correct division between tools make that programs can
keep in my mind. If you have pointers that are allocated in some place
and freed in another place then you code should be written in another way.
A good book for you could be 'The practice of programming'.
I work on video game technology; we have a multi-million line code base
written by 100+ people, and we need to be able to organize it and to provide
abstraction, because nobody can fit the whole thing in his mind.
Post by Roberto E. Vargas Caballero
Try to don't need such kind of things. For example Go hasn't generic and
you don't need them ever. You like them because your are so used to them
that you don't see the kind of code you are forced to write with them.
That is the complaint I hear most about Go: the lack of generics. That and
garbage collection.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Get your complexity right, inserting in a dynamic array is O(n), the eventual
need for an allocation is amortized (whereas you always end up doing a malloc
for your linked lists). Another thing you should look up is modern cpu
No. Because the realloc that you have to do maybe has to move in the
worst case n-1 elements in a new memory arena. So you have n (moving
to the position) * n-1 (memcpy to new position in memory) = n^2. You
can decrease the possibility of reallocation allocating
more memory of needed, but when you have the reallocation you get
the O(n^2). In a list you don't have ever a reallocation.
You already answered that.
Post by Roberto E. Vargas Caballero
If you read what I have said, iy you insert in the HEAD, and always take
the HEAD (FIFO fashion), you don't get any problem of locality, because
you directly access (or remove) the element you need. And now that you
are talking about deletion in dinamyc arrays, can you explain me how do you
[...]
Ok, so on a modern processor data access patterns matter: with your linked
list you basically have a cache miss on each access to the next element when
you iterate, whereas when you use an array you get linear access in memory,
which means you please the prefetcher and avoid most cache misses. That
effect on performance is huge, and in practice it largely dominates the
algorithmic complexity.

Additionally, the storage overhead of the linked list is interleaved with
the payload, which means you stress your memory accesses even more.
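(A self-contained toy comparison of the two access patterns, with invented
names and sizes; timing it is left to the reader, the point is only that the
list walk chases one pointer per element while the array walk is a linear
scan.)

#include <stdlib.h>
#include <stdio.h>

struct node { int value; struct node *next; };

int main(void)
{
        enum { N = 1000000 };
        long long sum_array = 0, sum_list = 0;

        int *array = malloc(N * sizeof *array);
        struct node *head = NULL;
        if (!array)
                return 1;

        for (int i = 0; i < N; i++) {
                array[i] = i;
                struct node *n = malloc(sizeof *n);
                if (!n)
                        return 1;
                n->value = i;
                n->next = head;         /* each node separately malloc'd */
                head = n;
        }

        for (int i = 0; i < N; i++)     /* linear scan, prefetcher friendly */
                sum_array += array[i];
        for (struct node *n = head; n; n = n->next)     /* pointer chase */
                sum_list += n->value;

        printf("array sum %lld, list sum %lld\n", sum_array, sum_list);
        return 0;
}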

Cheers,

Maxime.
Markus Teich
2014-09-17 05:57:41 UTC
Permalink
Post by Maxime Coste
That is the complaint I mostly hear about Go, the lack of Generic. That and
garbage collection.
Why would I want to apply the same algorithm to semantically different
arguments? Apart from different kinds of number representations, no useful
example comes to mind.

--Markus
Ralph Eastwood
2014-09-17 06:39:55 UTC
Permalink
Post by Maxime Coste
That last one is by far the most interesting, Bartos being very familliar with
C++. Note that its not C that is advocated, but haskell...
I would advocate functional languages (not necessarily Haskell) over C++ any
day, although implementations of functional languages have thus far
disappointed me slightly. I'm hoping a good multi-paradigm language will
appear some day.
Post by Maxime Coste
Post by Maxime Coste
That is the complaint I mostly hear about Go, the lack of Generic. That and
garbage collection.
Why would I want to apply the same algorithm to semantically different
arguments? Appart from different kinds of number representations no useful
example comes to mind.
That is a good point; I think the current purpose of generics in other
languages is to enforce type checking and provide implicit pointer conversion
(for example in C if you had a data structure/algorithm that took in a
generic type argument as a pointer). These are conveniences, and not
necessities.

However, there is one more valid use in the case of a generic algorithm or
data structure. Imagine you have subroutines/functions that relate to your
data structure and you want to add an item. Generics allow you to write one
piece of code to deal with all types and let you pass the items *by value*
rather than *by reference*. This would be useful when your item data type is
small, so it can be optimised by the compiler.
It's not like this can't be done in C: you can use X-Macros [1] to do this
(I realised after writing this example); I guess it's arguably less elegant
(see the sketch below). For more syntactic sugar, _Generic from C11 would
work, but is not necessary.
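(A toy sketch of the X-macro idea, with invented names: the type list is
written once and expanded several times to stamp out a small by-value box
type and a helper per entry.)

#include <stdio.h>

#define TYPES             \
        X(int_box, int)   \
        X(dbl_box, double)

/* generate one small by-value container per entry */
#define X(name, type) struct name { type value; };
TYPES
#undef X

/* generate a print helper per entry, taking the box by value */
#define X(name, type) \
        static void name##_print(struct name b) { printf("%g\n", (double)b.value); }
TYPES
#undef X

int main(void)
{
        struct int_box i = { 42 };
        struct dbl_box d = { 2.5 };
        int_box_print(i);       /* prints 42 */
        dbl_box_print(d);       /* prints 2.5 */
        return 0;
}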

On a more constructive note, I think a list of C++ paradigms that are commonly
used could be compared with their C counterparts... That would make it a useful
reference for suckless programmers.

What are people's views on literate programming for maths-heavy code?
Like Maxime says, operator overloading in C++ for maths code offloads the
brain so you can focus more on the algorithmic detail and remember what you
have done when you return to the code after a long time (or if someone else
has to read it). I think, however, that literate programming using LaTeX
would allow the same degree of expression and probably be even better for
readability.
I realise suckless code doesn't tend to make use of literate programming
often, particularly because the code is relatively simple, with memory
management patterns which C programmers would have seen countless times over,
but for complicated and novel algorithms, different thought patterns are
necessary.

C++ provides error-checking facilities that C does not possess.
However, I think these are
rather limited in scope. I would rather go for verification proofs [2].

Although C is good enough for most of us, I can't help but wonder if ideas
from Forth, Haskell, Lisp, Alef (and so on) could somehow provide a clean
systems language without feature bloat. For instance in C (yes, C), I can
immediately think of 6 different recursive mechanisms/keywords... are that
many necessary???

[1] http://en.wikipedia.org/wiki/X_Macro
[2] http://frama-c.com/

Cheers
--
Tai Chi Minh Ralph Eastwood
***@gmail.com
Ralph Eastwood
2014-09-17 06:41:50 UTC
Permalink
Apologies, I need a better mail client.
Post by Ralph Eastwood
[...]
--
Tai Chi Minh Ralph Eastwood
***@gmail.com
FRIGN
2014-09-17 09:04:56 UTC
Permalink
On Wed, 17 Sep 2014 07:41:50 +0100
Post by Ralph Eastwood
Apologies, I need a better mail client.
Try sylpheed or mutt.
--
FRIGN <***@frign.de>
Roberto E. Vargas Caballero
2014-09-17 07:04:21 UTC
Permalink
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Can you explain me why in C you have to represent them as int or long as
not with a structure like you did in c++?.
Because I want the convenience of adding two byte counts using +, which I can
do in C++, and have that compile down to exactly the same assembly as if I used ints,
but still get a compilation error if I try to add a byte count and a line count.
Ains, poor mighty super compiler. Why do you want different count types
for bytes and lines????? They are size_t. If you want to use the type
system to ensure the correctness of your code instead of writing
good code, then you have a problem understanding programming.
Post by Maxime Coste
Why would I overload the addition on lines ? As you say, there is no canonical
operation that makes sense for that. I use operator overloading only on types
where they match their semantic.
This is what I said: overloading is only useful in a few cases, and in almost
all of these cases a good language could add them to the language itself.
Almost all C++ coding styles forbid overloading; see the Google C++
Style Guide [1]:

Cons:
- While operator overloading can make code more intuitive, it
has several drawbacks:
- It can fool our intuition into thinking that expensive
operations are cheap, built-in operations.
- It is much harder to find the call sites for overloaded operators.
Searching for Equals() is much easier than searching for relevant
invocations of ==.
- Some operators work on pointers too, making it easy to introduce
bugs. Foo + 4 may do one thing, while &Foo + 4 does something
totally different. The compiler does not complain for either of
these, making this very hard to debug.
- User-defined literals allow creating new syntactic forms that
are unfamiliar even to experienced C++ programmers.
- Overloading also has surprising ramifications. For instance,
if a class overloads unary operator&, it cannot safely be
forward-declared.

I can see you are a novice just from this defense you are making of
overloading; it is one of the worst aspects of C++, and even expert C++
programmers accept this point.
Post by Maxime Coste
I work on video game technologies; we have a multi-million-line code base
written by 100+ people, and we need to be able to organize it and to provide
abstraction because nobody can fit the whole thing in his mind.
I have also worked in video games, with a code base of the size you are
mentioning, and I can tell you that at least 70% of the code is due to the
ugliness of C++. Take a look at st or dwm: thanks to being written in good C
they only have around 4000/2000 lines of code. If you tried to do the same in
C++ you would end up near 50000 for sure.

If you still want such a big number of lines, then you could take a look at
the source of the Linux kernel and git, and you will see how you can
structure big programs in C.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
Try not to need that kind of thing. For example Go doesn't have generics and
you never need them. You like them because you are so used to them that you
don't see the kind of code you are forced to write with them.
That is the complaint I mostly hear about Go, the lack of Generic. That and
garbage collection.
You should read [2]:

C++ programmers don't come to Go because they have fought
hard to gain exquisite control of their programming domain,
and don't want to surrender any of it. To them, software
isn't just about getting the job done, it's about doing it
a certain way.

Learn to think.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
No. Because the realloc that you have to do may, in the worst case, have to
move n-1 elements to a new memory arena. So you have n (moving to the
position) * n-1 (memcpy to the new position in memory) = n^2. You can
decrease the probability of reallocation by allocating more memory than
needed, but when the reallocation does happen you get the O(n^2). With a
list you never have a reallocation.
You already answered that.
No, you didn't answer it. You are talking about locality and you forget
that each time you reallocate you destroy your cache, because you have
to move the full array to a new position in memory.
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
If you read what I have said: if you insert at the HEAD, and always take
the HEAD (FIFO fashion), you don't get any problem of locality, because
you directly access (or remove) the element you need. And now that you
are talking about deletion in dynamic arrays, can you explain to me how you
[...]
Ok, so on a modern processor, data access pattern matters, with your linked list,
you basically have a cache miss at each access to the next element when you iterate,
when you use an array, you get linear access in memory, which means you please the
prefetcher and avoid most cache misses. That effect on performance is huge, and
I begin to think you don't read.

First, you are mixing up the question of data structures with memory
layout. You can build a list with code like this (simplified version,
without free support):

List *
allocitem(List *lp)
{
	static List *buffer;
	static size_t nr;
	List *bp;

	if (nr == 0) {
		/* current chunk exhausted: grab a new contiguous chunk of NR_BUFF nodes */
		buffer = xmalloc(sizeof(*lp) * NR_BUFF);
		nr = NR_BUFF;
	}
	bp = &buffer[--nr];	/* hand out nodes from within the chunk */
	bp->next = lp;		/* link the new node in front of lp */
	return bp;
}

This code allocates chunks of NR_BUFF elements and links them in consecutive
order. Running over this list will not generate a miss on each element,
and you never get a reallocation. I thought you would understand
this point with the static version. I hope you understand it now.

Second, I said that lists are good when you only access the first element
of the list, because then you don't have locality problems; you only touch
the first element. If you want to run over the list then it is better to
use another structure.

It's funny that people compare lists and dynamic arrays, but only optimize
the arrays...
Post by Maxime Coste
Additionally, the storage overhead of the linked list is interleaved with the
payload, which means you stress even more your memory accesses.
And you don't stress the cache when you move the full array in place, or
reallocate it to a new position, nooooo. Dynamic arrays are not an option
when the size of the array is a few thousand. Another problem: what happens
when you store pointers to the objects of the array in another structure?
You cannot do that, so you store pointers to the objects in the dynamic
array instead, and on each element access you have a miss. There are cases
where dynamic arrays are good, cases where lists are good, and other cases
where fixed-size arrays are the solution, but saying that C uses lists due
to problems of abstraction only means you don't understand anything.


Regards,

[1] http://google-styleguide.googlecode.com/svn/trunk/cppguide.html#Operator_Overloading
[2] http://commandcenter.blogspot.com.es/2012/06/less-is-exponentially-more.html
--
Roberto E. Vargas Caballero
FRIGN
2014-09-17 09:13:38 UTC
Permalink
On Wed, 17 Sep 2014 00:03:55 +0100
Post by Maxime Coste
Ok, so on a modern processor, data access pattern matters, with your linked list,
you basically have a cache miss at each access to the next element when you iterate,
when you use an array, you get linear access in memory, which means you please the
prefetcher and avoid most cache misses. That effect on performance is huge, and
in practice largely dominates the algorithmic complexity.
It's always funny to read when people try to game their compiler or processor into
doing something.
In reality, most modern processors are far from the cache-hit-or-miss machine
often described even in academic systems theory.
Instead, they _normally_ run and run until they miss the cache, so all this fancy
stuff you C++ advocates try to sell us for "better performance" is bullshit.

Of course linked lists come with an overhead and arrays are faster. Nothing
new. But you're wrong in trying to pull in the processor to explain why.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Alexander S.
2014-09-17 11:22:34 UTC
Permalink
Oh, another C vs C++ holy crusade, it seems.
I'd like to note here that while object-oriented programming can be
done in C, doing polymorphism, for example, is a pain in the ass;
furthermore, syntactic sugar and an ability to write e. g.
win.repaint(rect) instead of window_repaint_rectangle(win, &rect)
actually *increases* the readability of code when you have to deal
with several lines of that fashion in a row. Syntactic sugar, say what
you want about it, tends to reduce noise, as does language support for
certain powerful programming practices.
On 17 Sep 2014 11:04 GMT +3, Roberto E. Vargas Caballero
Post by Roberto E. Vargas Caballero
If you want to use the type
system to ensure the correctness of your code, instead of writing
good code then you have an understanding problem about programming.
That's an odd thing to hear from a professional like you. Making
mistakes isn't about "writing good code" (what does this even mean per
se) or not, it's about the fact that humans are not infallible. If
someone knows their own mistakes and wants to use the benefits of a
proper type system (which C unfortunately lacks altogether, and C++'s
attempts to fix the C mess are, while valiant, broken as well), you
cannot possibly condemn him for that.
The fact that in C++, you have to use ad-hoc structs for that instead
of just defining a new numeric type is, of course, another sad topic
altogether.

TLDR: C++ does many things to simplify a programmer's life, and you
cannot deny that. But it is also undeniable that it does them all
poorly, with rather obscure semantics even.
Post by Roberto E. Vargas Caballero
Post by Maxime Coste
Can't we have a civilized discussion ?
Is Maxime a male or female name?
...I wonder if you find your own question rude or not.
FRIGN
2014-09-17 11:33:19 UTC
Permalink
On Wed, 17 Sep 2014 15:22:34 +0400
"Alexander S." <***@gmail.com> wrote:

Hey Alexander,
Post by Alexander S.
furthermore, syntactic sugar and an ability to write e. g.
win.repaint(rect) instead of window_repaint_rectangle(win, &rect)
actually *increases* the readability of code when you have to deal
with several lines of that fashion in a row. Syntactic sugar, say what
you want about it, tends to reduce noise, as does language support for
certain powerful programming practices.
Have you ever heard of function pointers?
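For illustration, a minimal sketch of that idiom; Window, Rect and every name
below are invented for the example, not taken from any real code base:

typedef struct { int x, y, w, h; } Rect;

typedef struct Window Window;
struct Window {
	/* "method" stored as a function pointer; the receiver is passed explicitly */
	void (*repaint)(Window *self, Rect r);
};

static void window_repaint_rect(Window *self, Rect r)
{
	(void)self; (void)r;	/* actual drawing omitted in this sketch */
}

/* usage:
 *	Window win = { .repaint = window_repaint_rect };
 *	win.repaint(&win, (Rect){0, 0, 80, 25});
 */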
Post by Alexander S.
TLDR: C++ does many things to simplify a programmer's life, and you
cannot deny that. But it is also undeniable that it does them all
poorly, with rather obscure semantics even.
So when C++ does a poor job at doing many things to simplify a
programmer's life, it doesn't simplify a programmer's life.
Imho, the opposite is the case.
Post by Alexander S.
...I wonder if you find your own question rude or not.
Well, take a guess.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Maxime Coste
2014-09-17 18:25:49 UTC
Permalink
Post by FRIGN
On Wed, 17 Sep 2014 15:22:34 +0400
Hey Alexander,
Post by Alexander S.
furthermore, syntactic sugar and an ability to write e. g.
win.repaint(rect) instead of window_repaint_rectangle(win, &rect)
actually *increases* the readability of code when you have to deal
with several lines of that fashion in a row. Syntactic sugar, say what
you want about it, tends to reduce noise, as does language support for
certain powerful programming practices.
Have you ever heard of function pointers?
Tell me more: how do you implicitly bind the 'this' parameter with a
function pointer?
Post by FRIGN
Post by Alexander S.
TLDR: C++ does many things to simplify a programmer's life, and you
cannot deny that. But it is also undeniable that it does them all
poorly, with rather obscure semantics even.
So when C++ does a poor job at doing many things to simplify a
programmer's life, it doesn't simplify a programmer's life.
Imho, the opposite is the case.
C++ is not perfect, that is all, but rejecting imperfect improvements
is not the solution. C++'s biggest problem right now is backward
compatibility with both C and previous versions of itself. Lots of
things have been tried, some worked well, others less well, we learned
and have guidelines. Every C++ programmer I know would love a cleaned
up version of the language, but there is just too much existing code
for that to happen.

C shares the same problems: we cannot really say that the _Keywords that
got added in C99 and C11 are very elegant, the whole string library
has "deprecated" written on all the unsafe functions, the preprocessor
can be a pain, and the whole compilation model does not scale.

All I'm saying here is, sucky programs can be written in any language,
and I do not like to see Kakoune rejected on the sole ground of its
implementation language.

Cheers,

Maxime Coste.
Jimmie Houchin
2014-09-17 13:50:25 UTC
Permalink
Post by Alexander S.
Oh, another C vs C++ holy crusade, it seems.
As a participant, I hope it can remain a reasonably rational, reasonably
fact-based conversation even if it is a passionate one.

The reason I engaged in the discussion is that I don't like reading
arguments against C++ that were written before C++11/14 and its preferred
model of programming, when there may be C++ advocates who agree with some
of the pre-C++11/14 arguments but say "we don't program that way anymore".

It is easy to disregard old arguments when you don't know whether they
still hold for the current situation.
Post by Alexander S.
Post by FRIGN
Post by Maxime Coste
Can't we have a civilized discussion ?
Is Maxime a male or female name?
...I wonder if you find your own question rude or not.
I think it is hard to infer anything here. Email is an incredibly
difficult medium for conveying things across multiple cultures and
languages, with many participants not native to English.

It could simply be asking how to refer to you, as a he or she, in
conversation. Even with my own name: it is predominantly male, and I am
male (very male: husband, father, grandfather), but there are a reasonable
number of women with this name. So it is hard to assume anything,
especially across cultures and languages.


Jimmie
Roberto E. Vargas Caballero
2014-09-17 15:05:33 UTC
Permalink
Post by Alexander S.
Post by Roberto E. Vargas Caballero
If you want to use the type
system to ensure the correctness of your code, instead of writing
good code then you have an understanding problem about programming.
That's an odd thing to hear from a professional like you. Making
mistakes isn't about "writing good code" (what does this even mean per
se) or not, it's about the fact that humans are not infallible. If
Yes, but I don't agree that the type system should be used for this kind
of error. What happens if you mix variables holding the line counts of
different buffers? The type system will not detect anything, and if you go
ahead and declare a type for the line count of every buffer then the code
will be horrible.

If you want to detect this kind of error you can apply other techniques,
but not the type system. If you have a variable that is an integer count,
then the correct type is an integer.
Post by Alexander S.
someone knows their own mistakes and wants to use the benefits of
proper type system (which C unfortunately lacks altogether and C++
attempts to fix the C mess are, while valiant, broken as well), you
cannot possibly condemn him for that.
Yes, and this is the reason why you cannot pass a char * to a
function that expects a const char *, where a function that doesn't
modify the content of the buffer should accept a non-constant array.
The solution in C++, a cast, hides other errors, like for example
passing a pointer to int. Usually when you have a cast it is because
you are doing something wrong (at least with pointers), and you
have to use a lot of them in C++. I think the mess is the type
system of C++.

Regards,
--
Roberto E. Vargas Caballero
Martti Kühne
2014-09-17 15:23:00 UTC
Permalink
I like javascript, and I would love to see more Javascript in
suckless software.

In JavaScript I can abuse the lack of a type system and build functions
that return configurable, callable objects, which in turn return other
objects for which it is specified what data they will hold and what
functions they will execute.

With functions taking such factories as arguments
I have an even more powerful tool at hand than C++ templates
and only need to define in what spot what object can decide
what is done.

Would you, in memoriam of this accidental C++ thread,
include javascript highlighting in vis?

cheers!
mar77i
Maxime Coste
2014-09-17 18:15:51 UTC
Permalink
Post by Roberto E. Vargas Caballero
Yes, but I don't agree the type system must be used for this kind of
errors. What happens if you mix variables about count of lines of
different buffers? The type system will not detect anything, and
if you go ahead and declare a type for the line count of every
buffer then the code will be horrible.
If you want detect this kind of errors you can apply another techniques,
but not the type system. If you have a variable that is a integer count,
then the correct type is an integer.
Are you saying we should not use a tool to help with a problem if it
is not able to help with every similar problem?

I can eliminate a good bunch of hard-to-catch bugs; think about how much
easier working with utf8 becomes (where confusing byte counts and char
counts happens easily) when your compiler tells you 'nope, that cannot
possibly be right'. I have access to a tool; it's not perfect, I'd rather
use a strong typedef or something, but I won't reject it for that.
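For illustration only, the kind of mix-up the compiler can then reject looks
roughly like this even in plain C (names invented; the C++ version additionally
overloads + so the counts still read like plain integers):

#include <stddef.h>

typedef struct { size_t bytes; } ByteCount;
typedef struct { size_t lines; } LineCount;

static ByteCount bytecount_add(ByteCount a, ByteCount b)
{
	return (ByteCount){ a.bytes + b.bytes };
}

/* bytecount_add(some_byte_count, some_line_count) is now a compile error
 * instead of a silently wrong size_t addition */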
Post by Roberto E. Vargas Caballero
Post by Alexander S.
someone knows their own mistakes and wants to use the benefits of
proper type system (which C unfortunately lacks altogether and C++
attempts to fix the C mess are, while valiant, broken as well), you
cannot possibly condemn him for that.
Yes, and this is the reason why you cannot pass a char * to a
function that expects a const char *, where a function that doesn't
modify the content of the buffer should accept a non constant array.
The solution in c++, a cast, hides another errors, like for example
you pass a pointer to int. Usually when you have a cast is because
you are doing something wrong (at least with pointers), and you
have to use a lot of them in c++. I think the mess is the type
system of c++.
Oh boy, you don't know much about C++, do you; of course a char* is
implicitly convertible to const char*. On pointer semantics, the
only difference between C and C++ is that void* is not implicitly
convertible to other pointer types (the other direction is fine).

Cheers,

Maxime Coste.
Dimitris Papastamos
2014-09-17 18:23:53 UTC
Permalink
Post by Maxime Coste
Oh boy, you don't know much about C++ do you, of course a char* is
That's a blessing right? :)
Roberto E. Vargas Caballero
2014-09-17 20:34:20 UTC
Permalink
Ok, this is going to be my last mail in this discussion
(we have a saying for this in Spanish: don't try to teach
a donkey to play the flute, because first you waste your
time, and second you annoy the donkey).
Post by Maxime Coste
Post by Roberto E. Vargas Caballero
If you want detect this kind of errors you can apply another techniques,
but not the type system. If you have a variable that is a integer count,
then the correct type is an integer.
Are you saying we should not use a tool to help with a problem if it
is not able to help with every similar problems ?
What part of 'other techniques' do you not understand? Use the correct
tool for the job, and the type system is not the correct tool here. In
fact I cannot remember the last time I had an error like the one you are
describing. Do you have them often? Maybe you should change your way of
coding and avoid them, or maybe you would just realize you don't have
them.
Post by Maxime Coste
Oh boy, you don't know much about C++ do you, of course a char* is
Today is one of my happiest days: I just realized that after a lot
of years I am beginning to forget C++. Thank you for showing it
to me.

Regards,
--
Roberto E. Vargas Caballero
M Farkas-Dyck
2014-09-16 02:07:51 UTC
Permalink
Post by Maxime Coste
And for C++, well, I know there is some vocal individuals against it on the
sl mailing list, but I think most members are sensible, we do not need to
stay frozen with C89, C++ is bigger than C, more complex, but provides a lot
of abstraction features that makes it easier to reason and organize your
program.
There are some vocal individuals against it elsewhere too [1][2]. They
are not frozen with C89. I am not frozen with C89. You seem to imply a
false dichotomy.

[1] http://harmful.cat-v.org/software/c++/linus
[2] http://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/
Markus Teich
2014-09-16 07:37:44 UTC
Permalink
Post by M Farkas-Dyck
[1] http://harmful.cat-v.org/software/c++/linus
[2] http://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/
Heyho,

also relevant for the (non-)topic:
http://bartoszmilewski.com/2013/09/19/edward-chands/

--Markus
Jimmie Houchin
2014-09-16 22:25:02 UTC
Permalink
Post by Markus Teich
Post by M Farkas-Dyck
[1] http://harmful.cat-v.org/software/c++/linus
[2] http://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/
Heyho,
http://bartoszmilewski.com/2013/09/19/edward-chands/
--Markus
Thanks for that last link. I had read the first two.

I have been for the last several weeks (months) researching what
language I want to use to implement a couple of apps I want to do.

So I have this internal debate as to whether or not to learn C and/or
C++. On one hand I tend towards C. But to someone who has spent the last
20 years using and learning Python and Smalltalk, C looks pretty
primitive, C++ looks complicated, and C++ OOP does not look a thing like
Smalltalk OOP.

The sticky wicket in there for me is that I must connect to either C++
or Java libraries. One of my apps has a C wrapper around the C++ library,
so any language that can connect to C can use this library. But many of
the required libraries I need to use are in C++.

So I explore using C++ and the apologetics for such sound nice. But I
still look at all the complexity. Then I look at C and its
primitiveness. (Or at least seemingly so from the outside.)

And neither natively provide the interactiveness of Python/Smalltalk.
Which is something I require. So I would need then to add either Python
or probably Lua into the equation.

I was about to go C++, but then did a little exploration into the
current state of Julia. And I've decided to go with Julia and C. Julia
for the high level part of the app and C when I need to optimize, do
something that Julia isn't able to do, or get closer to the machine.

I am not saying that Julia is suckless. I don't know what the suckless
opinion of Julia is. I am not 100% suckless. But I am tending toward
suckless but it is a battle with the software forces at work.

The reason I write is that in my research on the pros and cons of C versus
C++, almost all of the anti-C++ writings are pre-2011 and therefore pre
C++11/C++14 and the coming C++17.

So, as someone who knows neither C nor C++ sufficiently, and who is
researching to make as qualified a decision as possible, I would find
information on how the newer C++ versions are viewed by the anti-C++
group very useful.

I would love to see some people do some apologetics on why not C++, for
the current C++, so that people can make an educated decision on the
current C versus the current C++.

Seeing how much C++ people complain about the C-like stuff or the actual
C stuff in C++: why don't they just grow a pair and clean out all the
stuff they complain about? Simplify the language and get on with it. As
it is, it seems as if it is just growing and nothing gets removed, only
new books saying don't use the old stuff. If you don't want it used then
remove it. Ugh!

My apologies for the mini-rant. And Hi! First time poster to suckless.
Thanks for having a group which fights against the current direction in
complexity in software.

Jimmie
FRIGN
2014-09-16 22:45:27 UTC
Permalink
On Tue, 16 Sep 2014 17:25:02 -0500
Jimmie Houchin <***@gmail.com> wrote:

Hey Jimmie,
Post by Jimmie Houchin
Seeing how much C++ people complain about the C like stuff or the actual
C stuff in C++. Why don't they just grow a pair and clean out all the
stuff they complain about. Simplify the language and get on with it. As
it is, is seems as if is just growing and nothing gets removed. Only new
books saying don't use the old stuff. If you don't want it used then
remove it. Ugh!
Because it's one of C++'s design goals to be backwards-compatible to C.
It's the only thing I'd attribute to the C language.
Given this design-goal, the result has been pretty remarkable.

But apart from that, I need a programming language to solve problems for
me efficiently.
And no other language has surpassed C for me (by far!).
Many of those apologetics trying to sweet-talk C++ are actually quite
obsessed with the fact they wasted years learning a language no human
can possibly learn to the fullest.

The strongest argument for me against C++ is not a technical one, but
the fact that you are forced to program in subsets.
This leads to the problem that new developers planning on contributing
to a project might have problems with adapting to it because it uses
a different subset of the C++-language than they are accustomed to.

I personally started with C++ a few years back when I began with system
programming.
The more I do with C and read about the problems C++-developers have,
I'm glad about having made the switch to C, even though it was harder
to learn in the beginning.
Post by Jimmie Houchin
My apologies for the mini-rant. And Hi! First time poster to suckless.
Thanks for having a group which fights against the current direction in
complexity in software.
You're welcome.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Jimmie Houchin
2014-09-17 21:15:52 UTC
Permalink
Post by FRIGN
On Tue, 16 Sep 2014 17:25:02 -0500
Hey Jimmie,
Post by Jimmie Houchin
Seeing how much C++ people complain about the C like stuff or the actual
C stuff in C++. Why don't they just grow a pair and clean out all the
stuff they complain about. Simplify the language and get on with it. As
it is, is seems as if is just growing and nothing gets removed. Only new
books saying don't use the old stuff. If you don't want it used then
remove it. Ugh!
Because it's one of C++'s design goals to be backwards-compatible to C.
It's the only thing I'd attribute to the C language.
Given this design-goal, the result has been pretty remarkable.
For me this is exactly where they need to figure out who they are or who
they want to be.

Being source code compatible with C is somewhat a violation of:

http://en.wikipedia.org/wiki/C%2B%2B
There should be no language beneath C++ (except assembly language).

http://www.stroustrup.com/bs_faq.html
"Within C++, there is a much smaller and cleaner language struggling to
get out".
Yes, that quote can be found on page 207 of The Design and Evolution of C++.


You can't maintain source code compatibility without having C inside of
C++. And if you have C inside of C++, then you have a language beneath
C++, since they consider C++ to be higher level than C, and they exhort
profusely that you should not program in a C style.

The smaller and cleaner language will never ever get out, if it only
keeps getting bigger and nothing ever gets removed.

Now, it would not be difficult to maintain C binary compatibility. Lots
of languages do.

It seems to me that what they really need to develop the courage to do
is this. For the C++17 standard. Decide on what that smaller and cleaner
language should be. Make strong deprecations and warnings for use of
stuff outside of the smaller, cleaner C++, but still compile if that is
the users wish. Then create a C++20 (or so) standard and remove all of
that stuff. If you are C++ developer and you like and want all of that
stuff outside of the smaller, cleaner C++. Stick with C++17 and earlier.

So they either need to change their philosophy of no language beneath
C++, give up their desire for a smaller, cleaner C++, or just give in to
being the beast they have created and are creating. They need to go under
the knife.

If at some point they don't gain the courage, they will have an ever
increasing monster, and an ever increasing publishing industry based
around what not to do in C++. :)
They are already well down this road.

They didn't ask my opinion, but oh well.
Post by FRIGN
But apart from that, I need a programming language to solve problems for
me efficiently.
And no other language has surpassed C for me (by far!).
Many of those apologetics trying to sweet-talk C++ are actually quite
obsessed with the fact they wasted years learning a language no human
can possibly learn to the fullest.
And to me that is one of the attractive things about C vs C++. I believe
over time I can fit C in my head. C++ not so much. And by the time
anybody gets enough of their head wrapped around it, it probably has
grown and changed again.
Post by FRIGN
The strongest argument for me against C++ is not a technical one, but
the fact that you are forced to program in subsets.
This leads to the problem that new developers planning on contributing
to a project might have problems with adapting to it because it uses
a different subset of the C++-language than they are accustomed to.
This is what scared me in considering C++. I control my own code. I can
easily program in that self-defined C++ sweet spot. But I am required to
use other people's libraries. I may have to read other peoples code. I
then have to step out of my choices into theirs. Then I could have
undefined consequences or headaches.
Post by FRIGN
I personally started with C++ a few years back when I began with system
programming.
The more I do with C and read about the problems C++-developers have,
I'm glad about having made the switch to C, even though it was harder
to learn in the beginning.
I hope I too can have a good C experience.
Post by FRIGN
Post by Jimmie Houchin
My apologies for the mini-rant. And Hi! First time poster to suckless.
Thanks for having a group which fights against the current direction in
complexity in software.
You're welcome.
Cheers
FRIGN
Jimmie
FRIGN
2014-09-18 08:25:41 UTC
Permalink
On Wed, 17 Sep 2014 16:15:52 -0500
Post by Jimmie Houchin
It seems to me that what they really need to develop the courage to do
is this. For the C++17 standard. Decide on what that smaller and cleaner
language should be. Make strong deprecations and warnings for use of
stuff outside of the smaller, cleaner C++, but still compile if that is
the users wish. Then create a C++20 (or so) standard and remove all of
that stuff. If you are C++ developer and you like and want all of that
stuff outside of the smaller, cleaner C++. Stick with C++17 and earlier.
So they either need to change their philosophy of now language beneath
C++. Give up their desire of a smaller, cleaner C++. Or just give in to
being the beast they have and are creating. They need to go under the knife.
If at some point they don't gain the courage, they with have an ever
increasing monster. And an ever increasing publishing industry based
around what not to do in C++. :)
They are already well down this road.
Or you just start using Go.
Post by Jimmie Houchin
This is what scared me in considering C++. I control my own code. I can
easily program in that self-defined C++ sweet spot. But I am required to
use other people's libraries. I may have to read other peoples code. I
then have to step out of my choices into theirs. Then I could have
undefined consequences or headaches.
Yes, well said.

Cheers

FRIGN
--
FRIGN <***@frign.de>
Ralph Eastwood
2014-09-17 22:26:41 UTC
Permalink
Post by FRIGN
The strongest argument for me against C++ is not a technical one, but
the fact that you are forced to program in subsets.
This leads to the problem that new developers planning on contributing
to a project might have problems with adapting to it because it uses
a different subset of the C++-language than they are accustomed to.
I personally started with C++ a few years back when I began with system
programming.
The more I do with C and read about the problems C++-developers have,
I'm glad about having made the switch to C, even though it was harder
to learn in the beginning.
Adding to that, most existing C++ codebases and libraries have such
wildly differing styles that writing glue code is a daunting task.
Personally, I've done some C++ for various reasons and I'm fairly able to
read C++ and write "clean" C++
as well as understand the Boost libraries (although some of the
template metaprogramming in there is beyond insane).

I'm surprised you found C was more difficult to begin with; I found C++'s
quirks far more mindboggling to me. Then again, I taught myself x86
assembly programming before C - that made pointers no mystery to me
at all
(I've heard many folks complain bitterly about not understanding pointers).
--
Tai Chi Minh Ralph Eastwood
***@gmail.com
Markus Teich
2014-09-17 05:50:05 UTC
Permalink
I have been for the last several weeks (months) researching what language I
want to use to implement a couple of apps I want to do.
What kind of apps are you planning to write?
So I have this internal debate in me as to whether or not to learn C and/or
C++. One one hand I tend towards C. But for someone who has spent the last 20
years using learning Python and Smalltalk. C looks pretty primitive. C++ looks
complicated. And C++ OOP does not look a thing like Smalltalk OOP.
See the „primitivity“ of C as a benefit. In the beginning you may have to think
a little harder to fit something into these „limitations“ but in the end it pays
off, since you don't have to struggle with much OOP complexity when maintaining
your code.
The sticky wicket in their for me is that I must connect to either C++ or Java
libraries. One of my apps has a C wrapper around the C++ library. So any
language that can connect to C can use this library. But many of the required
libraries I need to use are in C++.
Any C++ library that pretends to be sane also has C bindings.
And neither natively provide the interactiveness of Python/Smalltalk. Which
is something I require. So I would need then to add either Python or probably
Lua into the equation.
You could also try Go (http://golang.org/), which has syntax similar to
C/C++/Java, compiles to binaries, feels like python and allows for a very
interestingly restricted way of OOP.
The reason I write, is that in my research for pros and cons of C verses C++.
Almost all of the anti-C++ writings are pre 2011 and therefore pre C++11/C++14
and the coming C++17.
Did you check if the authors of the anti-C++ postings changed their opinion in
2011? They probably did not, and their critique is still valid.
And Hi! First time poster to suckless. Thanks for having a group which fights
against the current direction in complexity in software.
Welcome to sl, Jimmie.

--Markus
Jimmie Houchin
2014-09-17 14:28:41 UTC
Permalink
Post by Markus Teich
I have been for the last several weeks (months) researching what language I
want to use to implement a couple of apps I want to do.
What kind of apps are you planning to write?
What I am working on right now is a quantitative analysis trading
application. And in this field most use C++. However, this is for me and
my business in my spare time. So, the language choice is mine, within the
limitation of having to use one of the brokers' proprietary
libraries to connect with their server. Most brokers offer C++, Java and
.Net. I considered Clojure and Java. But I just struggle with the idea
of that Java elephant. I have never used a Java app that felt performant
or memory efficient or even memory reasonable.
Post by Markus Teich
So I have this internal debate in me as to whether or not to learn C and/or
C++. One one hand I tend towards C. But for someone who has spent the last 20
years using learning Python and Smalltalk. C looks pretty primitive. C++ looks
complicated. And C++ OOP does not look a thing like Smalltalk OOP.
See the „primitivity“ of C as a benefit. In the beginning you may have to think
a little harder to fit something into these „limitations“ but in the end it pays
off, since you don't have to struggle with much OOP complexity when maintaining
your code.
I won't argue with that. I know that there will be quite a different
mindset and model for programming. But as a long-time Smalltalker
(Squeak/Pharo), I have struggled leaving Smalltalk. The Smalltalk
language and environment are very productive and very immersive. OO that
is much more functional than C++/Java, etc.

First I have to find a language I want to try.
Then I have to find an editor I like. In Smalltalk, the IDE, environment
and language are all together. Then possibly a compiler, debugger and
tool chain.

Nothing seems simple when leaving Smalltalk.

Then with your statically compiled languages. You have the fact that
there is no ad hoc explorability at all. Whereas with something like
Smalltalk, Python, Lua, Clojure, Julia, something with a REPL you can
dynamically explore. You can write your app. Run your app and then
explore your app and data live while it is running. Very powerful for a
large set of applications.

This is something not so obvious in C/C++ for someone on the outside
looking in.

This is why I am looking at learning Julia and C. Julia is not OO. Julia
is very strong in math, science, technical computing. Julia interfaces C
almost effortlessly. It is high performance and reasonably efficient.
Julia/C is a seemingly perfect fit for my app's domain.

Julia offers one of the best REPL experiences I've seen. For me this is
a big win.
Post by Markus Teich
The sticky wicket in their for me is that I must connect to either C++ or Java
libraries. One of my apps has a C wrapper around the C++ library. So any
language that can connect to C can use this library. But many of the required
libraries I need to use are in C++.
Any C++ library that pretends to be sane also has C bindings.
I understand. However, nobody's pretending these people are sane. After
all, they did write it in C++ in the first place. :)
Post by Markus Teich
And neither natively provide the interactiveness of Python/Smalltalk. Which
is something I require. So I would need then to add either Python or probably
Lua into the equation.
You could also try Go (http://golang.org/), which has syntax similar to
C/C++/Java, compiles to binaries, feels like python and allows for a very
interestingly restricted way of OOP.
I have spent 20+ years avoiding C/C++. Always looking for the more
productive, faster to market, ... solutions. I sincerely regret not
learning C in the beginning. It has seriously limited my decisions.

I have decided I am going to correct that oversight. So much of what I
want to do interfaces C/C++. If your chosen tool hasn't already
implemented a wrapper around said C/C++ library, then it is up to
you. If you don't have C skills, you are dependent upon someone who does.

For me this stops now. I am going to work on my C skills first. Before I
move on to Julia or whatever more "productive", more "dynamic" environment.

Because I am attempting to learn modern C, these are the books I
currently have and am working through:

C Primer Plus, 6th ed., Stephen Prata
21st Century C, 2nd ed., Ben Klemens
Understanding and Using C Pointers, Richard Reese
Post by Markus Teich
The reason I write, is that in my research for pros and cons of C verses C++.
Almost all of the anti-C++ writings are pre 2011 and therefore pre C++11/C++14
and the coming C++17.
Did you check if the authors of the anti-C++ postings changed their opinion in
2011? They probably did not and their critic is still valid.
I don't necessarily agree. They may have the same opinions. But, that
doesn't mean that their arguments remain valid against "Modern C++".
Their arguments didn't address "Modern C++". So their document may still
express their opinion. Because they very well may still be anti C++. But
that doesn't mean they have one bit of knowledge of "Modern C++".
"Modern C++" being C++ according to the newest standards and used
according to the best practices proscribed by C++ advocates.

And even if they are expert in "Modern C++", their documents don't
address it. So, for those of us on the outside. It is unknown how in an
apologetic or debate they would address the points of the C++ advocate.
Post by Markus Teich
And Hi! First time poster to suckless. Thanks for having a group which fights
against the current direction in complexity in software.
Welcome to sl, Jimmie.
--Markus
Moving in a suckless direction is hard. You have to fight the entire
industry.

The state of Linux is exasperating. The systemd fiasco is frustrating. I
feel like I have turned around and am swimming upstream. I am learning
to use more keyboard and less mouse. More terminal/console and less gui.
But I am not ready for mutt and vim yet. Maybe someday. I do still like
some creature comforts.

Thanks for engaging in my questions. I appreciate input even when I
disagree. For what I disagree with today, may be what I agree with
tomorrow. Who knows. :)

Jimmie

Thanks.
Dimitris Papastamos
2014-09-17 14:45:46 UTC
Permalink
Post by Jimmie Houchin
C Primer Plus, 6th ed., Stephen Prata
21st Century C, 2nd ed., Ben Klemens
Understanding and Using C Pointers, Richard Reese
I am not familiar with these books, I learnt from K&R and I'd recommend
you do the same. Once you've gone through the book and the exercises you
can start studying and modifying modern C code. If you are going to do UNIX
programming then study APUE by Richard Stevens.

The Practice of Programming by Kernighan and Pike can be studied on
the side.
Maxime Coste
2014-09-16 23:43:23 UTC
Permalink
Post by Markus Teich
Post by M Farkas-Dyck
[1] http://harmful.cat-v.org/software/c++/linus
[2] http://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/
Heyho,
http://bartoszmilewski.com/2013/09/19/edward-chands/
That last one is by far the most interesting, Bartosz being very familiar with
C++. Note that it's not C that is advocated, but Haskell...

Cheers,

Maxime Coste.
Marc André Tanner
2014-09-15 19:03:59 UTC
Permalink
Post by Maxime Coste
Hello,
Here are a few thought on your design, based on my own experience with Kakoune.
Thanks for sharing them! Kakoune looks interesting. At some later point
I will take a closer look.
Post by Maxime Coste
Post by Marc André Tanner
Text management using a piece table/chain
=========================================
[...]
While this looks like a nice data structure for editing arbitrary byte
string, you can get much better actual performances if you decide you write
a text/code editor.
I'm not sure whether this is actually the case. That is part of the reason
why I wrote it in the first place, to see how it behaves in practice. I'm not
aware of any other console style, open source editors for *nix systems
using a similar data structure (the closest I came was AbiWord).

I find the piece chain rather elegant. In particular the undo/redo
implementation and the fact that it naturally leads to a read only mmap(2)
based solution. That is, the buffer management is taken care of by the
operating system's virtual memory subsystem.
Post by Maxime Coste
regular text is naturally line/column oriented, and storing it in the form of
a dynamic array of lines (with lines being simple strings) works very well and
gives excellent performance once you use (line, column) pairs to reference it.
I agree that this works fine for source code and other small files. However
I'm optimistic that the piece chain will work satisfactorily too. Besides,
there already exist plenty of editors based on dynamic arrays. I really saw
no point
in creating another one (besides the fun, learning experience etc.).
Post by Maxime Coste
In practice your user think about text in this line column fashion, which
implies that your text editing will stay mostly line column centric, so
things ends up much simpler when the editing backend itself is matching that.
As I said, I'm not totally convinced this is actually the case, especially
if you have to deal with characters which occupy multiple columns etc.
Byte-based addressing seems to work ok so far. For example I like the fact
that you can write your movements as functions which take the current file
position as argument and return the new one. All completely independent from
the display state.
Post by Maxime Coste
That said, this is limited to actual text, arbitrary byte sequences do not
map well to this, in which case your piece table seems nice.
I agree. But for now I still think the piece table also works fine for text
files.
Post by Maxime Coste
Post by Marc André Tanner
Screen Drawing
==============
[...]
Window-Management
-----------------
In principle it would be nice to follow a similar client/server approach
as sam/samterm i.e. having the main editor as a server and each window
as a separate client process with communication over a unix domain socket.
[...]
The client server thing can stay quite simple if you avoid any synchronisation.
in Kakoune once the connection is done, the client sends keystrokes,
and the server sends display commands. Once you have your poll event loop
(which you will endup having if you want to handle anything asynchrounously)
this integrate very easily.
Yes, it doesn't have to be rocket science, but at this stage other things
seem more important. It also depends on what kind of assumptions you make.
For example, once you want to properly handle partial reads/writes, things
just become rather more complicated.
Post by Maxime Coste
Post by Marc André Tanner
Editor Frontends
================
vis a vim like frontend
-----------------------
[...]
So it seems you are basically targetting very close to vi interface, I am
always a little sad to see new editors doing that. vi and vim have tons of
good ideas in them, but the editing model has a lot of room for improvement
to get a more consistent and regular interface.
This is true to some degree. The problem is that most people are already
familiar with vi(m) (myself included). Therefore the hope is that by sticking
to the vim conventions more contributors will be attracted. As for my personal
needs they are almost covered by the currently implemented functionality.

It would be interesting to know which parts of the editing model you consider
particularly problematic.
Post by Maxime Coste
Kakoune is one direction, integrating multi-selection and focusing on
interactive edition, which gave very good results in term of keystrokes count
(it beats vim on several vimgolf challenges).
I agree Kakoune seems interesting, I will have to read more about it.
Post by Maxime Coste
I expect there are lots of
alternatives directions to improve the vi-like user interface, and trying
to improve the implementation without trying to improve the design itself
seems like a waste.
For me these two things are largely unrelated. A solid foundation should
make it easy to experiment with new editing paradigms. I certainly encourage
people to try different things.
Post by Maxime Coste
Anyway, best of luck on your project, writing a code editor is a very
rewarding experience.
Thanks! Again, I agree.

Cheers,
Marc
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Andrew Hills
2014-09-15 20:24:45 UTC
Permalink
Post by Marc André Tanner
This is true to some degree. The problem is that most people are already
familiar with vi(m) (myself included). Therefore the hope is that by sticking
to the vim conventions more contributors will be attracted. As for my personal
needs they are almost covered by the currently implemented functionality.
And one advantage of vi(m) is that I can get it on basically any
platform, so even if I don't have access to vis or other related
projects, as long as the interfaces are similar, I will not have much
difficulty. But, if the editor's separation allows for other, better
editing interfaces, and the program's simplicity means that it can be
easily compiled on BSDs and Linuxes with all the various supporting
libraries and versions thereof, then I would have no problem learning a
new tool (provided it is actually better than vi).
Dmitrij D. Czarkoff
2014-09-15 13:48:31 UTC
Permalink
Post by Marc André Tanner
Editor Frontends
================
The editor core is written in a library like fashion which should make
it possible to write multiple frontends with possibly different user
interfaces/paradigms.
At the moment there exists a barely functional, non-modal nano/sandy
like interface which was used during early testing. The default interface
is a vim clone called vis.
The frontend to run is selected based on the executable name.
While I am probably too late to the party, I would suggest separating
your "vis" into two distinct parts on the same principle as vi and ex:
there should be an ex-like CLI editor accepting commands from stdin and
printing output to stdout, and there should be a separate UI that wraps
the ex-like CLI, sending commands and parsing output. Such a split would
make it possible to have a static build of the ex-like CLI on an embedded
device and to control it with a local interface over SSH/telnet.

P.S.: there is already a program called "vis" – replacement for "cat -v"
as suggested by Rob Pike and Brian Kernighan.[1] Probably another name
could be chosen. I like "vie" option, although "svi" also appears to be
available (according to OpenBSD ports at least).

[1] http://harmful.cat-v.org/cat-v/unix_prog_design.pdf
--
Dmitrij D. Czarkoff
Gregor Best
2014-09-16 11:56:14 UTC
Permalink
Hi guys,

I've got two patches and three questions:

First, the patches. The first fixes editing of length 0 files, the second fixes
compilation on OpenBSD. Since _BSD_SOURCE was already present in other files
belonging to vis, I figured adding it to vis.c as well poses no harm.

The first question is about line numbers. Would a patch adding display of line
numbers be accepted, or is that considered unnecessary cruft? I find it makes
jumping through a file a bit easier.

My other question is about piping (part of) the buffer to an external command,
such as fmt. Is someone already working on that or is that something I could
start looking into? Piping would also provide a (stop-gap) solution for :s, for
example by piping to sed or some other stream editor.

I'm also seeing quite a bit of display corruption when opening large syntax
highlighted files, such as window.c and scrolling around the file for a while.
Is anyone else seeing that or is it just me? I'm using tmux and st, if that
matters.
--
Gregor Best
Marc André Tanner
2014-09-16 19:01:20 UTC
Permalink
Post by Gregor Best
Hi guys,
First, the patches. The first fixes editing of length 0 files,
Thanks applied with some cosmetic changes. I'm not sure the perror call makes
that much sense and since the other error paths do not contain it, I've left
it out for now.
Post by Gregor Best
the second fixes
compilation on OpenBSD. Since _BSD_SOURCE was already present in other files
belonging to vis, I figured adding it to vis.c as well poses no harm.
I actually also got another patch related to these defines:

This fixes warning with latest glibc (>= 2.19.90), which deprecated _BSD_SOURCE

warning: _BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE

I saw that musl also added support for _DEFAULT_SOURCE in a recent commit.

Therefore I'm adding:

#define _DEFAULT_SOURCE
#define _POSIX_SOURCE /* needed for sigaction */
#define _BSD_SOURCE /* needed for OpenBSD SIGWINCH */

Could a libc expert (nsz if you happen to read this?) comment on whether this
makes sense or to what should be used instead?
Post by Gregor Best
The first question is about line numbers. Would a patch adding display of line
numbers be accepted, or is that considered unnecessary cruft? I find it makes
jumping through a file a bit easier.
If there exists a way to disable them, then yes. This should basically amount
to a loop from win->topline till win->lastline always printing line->lineno.
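Roughly along these lines; this is only a sketch with stand-in types, the
real structures in vis differ:

#include <stdio.h>

typedef struct Line Line;
struct Line {
	size_t lineno;
	Line *next;
};

typedef struct {
	Line *topline;	/* first line currently shown in the window */
	Line *lastline;	/* last line currently shown in the window */
} Win;

static void draw_line_numbers(Win *win)
{
	for (Line *l = win->topline; l; l = l->next) {
		printf("%zu\n", l->lineno);	/* the real thing would draw into the curses window */
		if (l == win->lastline)
			break;
	}
}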
Post by Gregor Best
My other question is about piping (part of) the buffer to an external command,
such as fmt. Is someone already working on that or is that something I could
start looking into? Piping would also provide a (stop-gap) solution for :s, for
example by piping to sed or some other stream editor.
I agree this is needed, feel free to work on it ...
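The classic approach would be fork(2) plus a pair of pipes, roughly like the
sketch below. This is an illustration only, not vis code: error handling is
minimal, and a real version has to interleave the reads and writes (or go
through a temporary file) to avoid deadlocking on large ranges:

#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* feed len bytes of in to cmd via /bin/sh and return the command's output */
static char *filter(const char *cmd, const char *in, size_t len, size_t *outlen)
{
	int to[2], from[2];

	if (pipe(to) == -1 || pipe(from) == -1)
		return NULL;
	pid_t pid = fork();
	if (pid == -1)
		return NULL;
	if (pid == 0) {		/* child: stdin/stdout redirected to the pipes */
		dup2(to[0], STDIN_FILENO);
		dup2(from[1], STDOUT_FILENO);
		close(to[0]); close(to[1]); close(from[0]); close(from[1]);
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);
	}
	close(to[0]); close(from[1]);
	write(to[1], in, len);	/* caveat: may block forever if the range exceeds the pipe buffer */
	close(to[1]);

	size_t cap = 4096, n = 0;
	char *out = malloc(cap);
	ssize_t r;
	while (out && (r = read(from[0], out + n, cap - n)) > 0) {
		n += r;
		if (n == cap)
			out = realloc(out, cap *= 2);
	}
	close(from[0]);
	waitpid(pid, NULL, 0);
	*outlen = n;
	return out;
}

A stop-gap :s by piping the affected range to sed, as you suggest, would then
come almost for free.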
Post by Gregor Best
I'm also seeing quite a bit of display corruption when opening large syntax
highlighted files, such as window.c and scrolling around the file for a while.
Is anyone else seeing that or is it just me? I'm using tmux and st, if that
matters.
Could you elaborate a bit more on what you mean by corruption? Wrong syntax
highlighting / colors, text not showing up at all ...

If "just" the colors are screwed up, it is probably due to a recent commit
which tries to make syntax highlighting more efficient. I will have to take
another look.

In any case make sure to use the latest git version which has some code
cleanups in window.c (not related to drawing though).

Thanks,
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Marc André Tanner
2014-09-17 14:56:59 UTC
Permalink
Post by Marc André Tanner
Operators
---------
planned: > (shift-right), < (shift-left)
Those are now also (at least in a rudimentary way) implemented.

Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?

All combinations I tried like: n>mj simply indent the next n*m lines
by one level.
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Christian Neukirchen
2014-09-17 15:27:46 UTC
Permalink
Post by Marc André Tanner
Post by Marc André Tanner
Operators
---------
planned: > (shift-right), < (shift-left)
Those are now also (at least in a rudimentary way) implemented.
Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?
Vnjm>

It doesn't work without visual mode I think.

OTOH, you can use :5>>> e.g.
--
Christian Neukirchen <***@gmail.com> http://chneukirchen.org
Silvan Jegen
2014-09-17 15:32:51 UTC
Permalink
Post by Marc André Tanner
Post by Marc André Tanner
Operators
---------
planned: > (shift-right), < (shift-left)
Those are now also (at least in a rudimentary way) implemented.
Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?
It is, kind of...

You can use visual mode for that like this

vnjm>

where n and m are your count variables.
ale rimoldi
2014-09-17 20:12:46 UTC
Permalink
hi marc andré,

first thanks for your fine vis... i spent about an hour trying it out,
while taking notes on what i'm missing...

it misses too many features i use in my everyday work with vim, but it
was a positive experience.

i'll probably send my comments very soon...
Post by Marc André Tanner
Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?
All combinations I tried like: n>mj simply indent the next n*m lines
by one level.
i can't say i'm an expert, since i use a small subset of vim features,
but i'm for sure a heavy user.

i would say that you don't really need an indenting of n lines by m
levels.

the resulting command is -- in my eyes -- a bit too complex. and there
are simple workarounds that use general available features.

the simplest one already works in vis. indent n lines (or better a
"{" inner area) and repeat the action with dots (eventually correcting
the exceeding indenting with u).
this is in my experience faster than wondering which of n or m comes
before the > and which after.

and i think you agree with me that you should not have too long
and too deep indents.

the other workaround is to use == (automatic aligning) on the next n
lines (or, again, on i{). which will mostly automatically do the m
indenting you're looking for.
personally, i only use == on single lines (or very few lines), but it
should do what the n>mj you're proposing would do.

ciao
a.l.e
Marc André Tanner
2014-09-18 16:33:38 UTC
Permalink
Post by ale rimoldi
hi marc andré,
first thanks for your fine vis... i spent about an hour trying it out,
while taking notes on what i'm missing...
Please share your findings.
Post by ale rimoldi
it misses too many features i use in my everyday work with vim, but it
was a positive experience.
i'll probably send my comments very soon...
Please do so, it would be very helpful.
Post by ale rimoldi
Post by Marc André Tanner
Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?
All combinations I tried like: n>mj simply indent the next n*m lines
by one level.
i can't say i'm an expert, since i use a small subset of vim features,
but i'm for sure an heavy user.
i would say that you don't really need an indenting of n lines by m
levels.
I tend to agree. Nevertheless I was curious how it works.
Post by ale rimoldi
the resulting command is -- in my eyes -- a bit too complex. and there
are simple workarounds that use generally available features.
the simplest one already works in vis. indent n lines (or, better, a
"{" inner area) and repeat the action with the . command (correcting
any excess indenting with u if needed).
this is in my experience faster than wondering which of n or m comes
before the > and which after.
and i think you agree with me that you should not have too long
and too deep indents.
Yes.
Post by ale rimoldi
the other workaround is to use == (automatic aligning) on the next n
lines (or, again, on i{), which will mostly do the m levels of
indenting you're looking for automatically.
personally, i only use == on single lines (or very few lines), but it
should do what the n>mj you're proposing would do.
== will certainly not be implemented internally by the editor ...
--
Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: CF7D56C0
Hugues Evrard
2014-09-19 12:45:37 UTC
Permalink
Hi all,
Post by ale rimoldi
Post by Marc André Tanner
Could some vim expert on the list tell me whether it is possible to
indent the next n lines by m levels in vim?
the simplest one already works in vis. indent n lines (or better a
"{" inner area) and repeat the action with dots (eventually correcting
the exceeding indenting with u).
this is in my experience faster than wondering which of n or m comes
before the > and which after.
Disclaimer: What follows is just my point of view, and definitely *not*
a feature request!

I would like to raise a point about the level of abstraction a text editor
offers, in particular when editing code.

1. The basic editing commands are really *text* oriented, and they do
not depend on what kind of text you are editing: go to a position in
the file, insert a character, find a sequence of characters, and the like.

2. Other commands begin to be on a higher level, and introduce concepts
that may depend on the nature of the text, e.g. "erase a word":
- how is a word defined? a sequence of characters between whitespace?
(e.g. in Japanese there is no whitespace between words)
- how do you define: newline, line, whitespace, paragraph, word
separators? (e.g. is underscore a word separator?)

I guess most editors have a common ground on such definitions, and/or
offer a way to define them. For instance, in AWK one can redefine the
"field separator" (FS) to control how a line is split.

3. When you edit code, you work on structured text with code-related
abstractions: the code may contain comments, function definitions, etc.
The editor may use this knowledge to alter the editing interface, the most
prominent example being syntax highlighting. Such information may also
be useful for editing commands, as you can use new abstractions such as
"comments", "function", "block" in the editing commands.

My point is that some of these abstractions can be defined independently
of the programming language being edited. For instance, a comment
is a comment, be it in C, shell script or LaTeX sources, each with its
particular comment syntax. I think using such programming-language
agnostic abstractions is useful for defining uniform editing commands.

Let me give an example:
- we want to comment a block of code
- we edit C-like code, i.e.:
- let's say a block of code is defined by text between the next "{"
and its matching "}"
- comment means add "//" at the beginning of the line

To realize this on "level 2" commands, we may use these commands:
- select text between "{" and next "}"
- add "//" at the beginning of all lines of text selection
(and I feel this is how most vi users work, but I might be wrong)

To realize this on "level 3" commands, we may use these commands:
- select "block of code"
- comment current selection

==> Level 3 commands do not depend on the actual syntax ("{", "//"
characters), but on code abstractions.
==> We can use *the very same commands* for LaTeX sources for instance,
where "{" would be "\begin", "}" would be "\end" and "//" would be "%".

However, such abstractions require the editor to know the structure of
the language being edited. If this means having a parser for each language
being edited, then I think it's better not to have such abstractions.
Still, the information needed for syntax highlighting may be enough to
offer some basic abstractions, which may be obtained using regexps (maybe
regexps are already too heavy?):
- comments
- blocks of code
- beginning and end of function definitions
- text between two matching "parentheses" (same abstraction for
"(...)", "[...]", "{...}", etc.)

=== TL;DR ===
- editing commands could use abstractions that are independent of the
actual source code syntax
- e.g., say "comment this line" rather than "add <comment chars> at the
beginning of the line"
- pro: offers uniform commands for editing different programming languages.
- con: requires knowledge of the programming language being edited

PS: first real post for me, I have happily been reading the list for months!
--
Hugues Evrard
Martti Kühne
2014-09-19 12:59:00 UTC
Permalink
Post by Hugues Evrard
- we want to comment a block of code
- let's say a block of code is defined by text between the next "{"
and its matching "}"
- comment means add "//" at the beginning of the line
- select text between "{" and next "}"
- add "//" at the beginning of all lines of text selection
(and I feel this is how most vi users work, but I might be wrong)
- select "block of code"
- comment current selection
==> Level 3 commands do not depend on the actual syntax ("{", "//"
characters), but on code abstractions.
==> We can use *the very same commands* for LaTeX sources for instance,
where "{" would be "\begin", "}" would be "\end" and "//" would be "%".
Hmm, adding /* and */ around the selection is two times two key presses.
But I do see your approach being useful in languages where this isn't available.

Then again, what we actually do with our editors looks more like:
comment out this function (/* */), then go on with a copy of the above, so we
can undo out of order...

Or we might select a number of lines and make another function from them.
This might be going places.

cheers!
mar77i
Raphaël Proust
2014-09-19 13:20:05 UTC
Permalink
Post by Hugues Evrard
[…]
However, such abstractions require the editor to know the structure of
the language being edited. If this means having a parser for each language
being edited, then I think it's better not to have such abstractions.
Still, the information needed for syntax highlighting may be enough to
offer some basic abstractions, which may be obtained using regexps (maybe
regexps are already too heavy?):
- comments
- blocks of code
- beginning and end of function definitions
- text between two matching "parentheses" (same abstraction for
"(...)", "[...]", "{...}", etc.)
I think that the biggest problem is that not all languages have these.

Haskell doesn't really have “blocks of code” (although admittedly the
monad notation…), nor does Lisp. Instead they have expressions, which
are more tree-like than list-like. The nesting of code in C is
shallower than in Lisp, which means different priorities for the
movement and range operations in the different languages.

On the other hand, some languages have features you do not cover.

ML (and some other languages) has a powerful module system that requires more
bindings than C. Lisp, ML, Haskell, Scala, Rust, &c. have pattern
matching, which is sort of like switches but not really. &c.


All in all, I think that a high-level, language-agnostic editor is
very difficult. It might actually lead to IDEs and other such monsters
that support a given list of languages but are actually a pain to use
as a general tool.

Note that some editors make it more or less easy to integrate a
language-specific toolchain. (Acme is really good at that because of
its handling of IO, its 9p interface, and the sam editing language.
Vim is terrible, and it takes mad geniuses like tpope to get decent
filetype plugins.)


Consider the merlin project for OCaml. They forked the language parser
and typer to make a daemon that provides IDE-like features for editors
to interact with. I think this design is simpler: editors edit
text, they talk to a server about line numbers and character offsets,
and the server understands these positions as meaningful code entities.


Cheers,
--
______________
Raphaël Proust
Maxime Coste
2014-09-19 13:22:16 UTC
Permalink
Hello
Post by Hugues Evrard
=== TL;DR ===
- editing commands could use abstractions that are independent of the
actual source code syntax
- e.g., say "comment this line" rather than "add <comment chars> at the
beginning of the line"
- pro: offers uniform commands for editing different programming languages.
- con: requires knowledge of the programming language being edited
I have spent quite some time thinking about this syntax object problem,
and I think it is mostly attainable with a layered approach:

Most language syntax objects are easy to express in terms of generic text objects:

- A C function is a { } block preceded by a parenthesis block and a few words
whose last is not a control flow keyword
- A Python function is an indent block preceded by a parenthesis block preceded
by a word and 'def'
- A Lisp function is a parenthesis block whose first word is defun

That won't be 100% correct, but good enough for most use cases.

Based on this, I think that with an expressive enough editing language you
can easily define that from basic builtin blocks.

That was one of the motivations for swapping selection and operation order in
Kakoune (haters gonna hate...): by decoupling selections from the operator, you
can express arbitrarily complex selection operations; you have a (rather limited)
version of that with vim's visual mode.

Once you have that, you can provide the upper-level operations by defining them
per language in terms of basic operations.

The advantages (IMHO) of defining higher-level constructs like that are the following:

- It keeps the language-specific ones out of the core
- It forces an expressive set of general-purpose core commands

The hard part is finding that expressive set of core commands.

The problem with parsers is that, besides being quite slow if data-driven, they
are not very good at analysing invalid code, which is the most common state of
code being edited.

Cheers,

Maxime Coste.
Raphaël Proust
2014-09-19 14:13:28 UTC
Permalink
Post by Maxime Coste
[…]
That was one of the motivations for swapping selection and operation order in
Kakoune (haters gonna hate...): by decoupling selections from the operator, you
can express arbitrarily complex selection operations; you have a (rather limited)
version of that with vim's visual mode.
Once you have that, you can provide the upper-level operations by defining them
per language in terms of basic operations.
From what I understand of Kakoune (correct me if I'm wrong) the
selection system is a dynamic/interactive equivalent to the structural
regular expressions of sam/Acme.

1: Can you comment on the difference in expressiveness? (Assuming you
tested Acme/sam, otherwise don't bother.)

2: Does Kakoune have selection macros? I.e., a way to repeat the same
selection keystroke input. Or, even better, a way to go and edit the
selection pattern like q: does for commands in vim. That would
actually allow defining higher-level constructs and reusing them across
projects of the same language, &c.
Post by Maxime Coste
The problem with parsers is that, besides being quite slow if data-driven, they
are not very good at analysing invalid code, which is the most common state of
code being edited.
Actually, modern parsers can deal with that. They have failure modes
that allow restarting after a parse failure (i.e., parse the next code
block even if the current one is invalid). Incremental parsing also
helps with speed.
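
A toy C sketch of such panic-mode recovery, as a generic illustration
only (not taken from any real parser): on a syntax error, skip ahead to
a synchronisation token such as ';' and keep going, so later statements
still get analysed:

#include <stdio.h>

/* "parse" a toy language where every statement must look like "x;" */
static void parse(const char *src)
{
        for (size_t i = 0; src[i]; ) {
                if (src[i] == ' ') {
                        i++;
                } else if (src[i + 1] == ';') {
                        printf("statement '%c' ok\n", src[i]);
                        i += 2;
                } else {
                        printf("syntax error at offset %zu, recovering\n", i);
                        while (src[i] && src[i] != ';')  /* panic: skip to ';' */
                                i++;
                        if (src[i] == ';')
                                i++;
                }
        }
}

int main(void)
{
        parse("a; b; @@@; c;");   /* the broken statement does not hide c; */
        return 0;
}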


Cheers,
--
______________
Raphaël Proust
Maxime Coste
2014-09-19 17:31:35 UTC
Permalink
Hello,
Post by Raphaël Proust
Post by Maxime Coste
[…]
That was one of the motivations for swapping selection and operation order in
Kakoune (haters gonna hate...): by decoupling selections from the operator, you
can express arbitrarily complex selection operations; you have a (rather limited)
version of that with vim's visual mode.
Once you have that, you can provide the upper-level operations by defining them
per language in terms of basic operations.
From what I understand of Kakoune (correct me if I'm wrong) the
selection system is a dynamic/interactive equivalent to the structural
regular expressions of sam/Acme.
1: Can you comment on the difference in expressiveness? (Assuming you
tested Acme/sam, otherwise don't bother.)
So, I have not used sam/acme a lot, just played around with them, but basically
we should have a similar expressiveness:

* select all regex matches/split on matches (in already selected text) is one of
Kakoune's basics. This is the equivalent of sam's x/y commands.

* Another building block is keep/remove selections containing a match to a given
regex.

* virtually all selection operations are recursive: they are applied to already
existing selections.

* Due to multiple selections, an operation naturally applies to every selection, so the
looping nature of x in sam is preserved.

In Kakoune, x would be s. To remove all instances of 'string' in a buffer you would do
%sstring<ret>d, with % selecting the whole buffer and s opening a prompt for a regex (where
we enter string and validate with <ret>); after this <ret> we have every instance of
string selected, and the d command deletes the selected text.

On top of that you have access to vi-style movements. So to remove the words following
string, you could replace d with wd in the previous command.

Having the vi-style movements probably improves the expressiveness, because we can express
things that are notably hard to do with a regex (selecting the enclosing { .. } block
while respecting nesting, for example).
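
For instance, a nesting-aware block selection only needs a small scanner;
here is a rough C sketch (an illustration only, neither Kakoune's nor vis'
code, and enclosing_block is a made-up name):

#include <stdio.h>
#include <string.h>

/* find the innermost { } pair enclosing pos; start/end are set to -1 if none */
static void enclosing_block(const char *buf, size_t pos, long *start, long *end)
{
        long depth = 0;
        *start = *end = -1;
        for (long i = (long)pos; i >= 0; i--) {            /* scan left  */
                if (buf[i] == '}') {
                        depth++;
                } else if (buf[i] == '{') {
                        if (depth == 0) { *start = i; break; }
                        depth--;
                }
        }
        if (*start < 0)
                return;
        depth = 0;
        for (size_t i = (size_t)*start + 1; buf[i]; i++) { /* scan right */
                if (buf[i] == '{') {
                        depth++;
                } else if (buf[i] == '}') {
                        if (depth == 0) { *end = (long)i; break; }
                        depth--;
                }
        }
}

int main(void)
{
        const char *code = "if (x) { foo(); { bar(); } baz(); }";
        long s, e;
        enclosing_block(code, strcspn(code, "z"), &s, &e); /* cursor on the 'z' of baz */
        printf("block spans [%ld, %ld]\n", s, e);          /* prints [7, 34]           */
        return 0;
}

A single regex cannot count that nesting depth, which is why it makes
more sense to expose it as a movement/text object than as a pattern.
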
Post by Raphaël Proust
2: Does Kakoune have selection macros? I.e., a way to repeat the same
selection keystroke input. Or, even better, a way to go and edit the
selection pattern like q: does for commands in vim. That would
actually allow defining higher-level constructs and reusing them across
projects of the same language, &c.
Kakoune has vi-style macros (record and replay); they are just a list of keystrokes, so they
can select and modify the text.

In addition to that, you can map a set of keystrokes to a key in order to add your own
selection primitives.

All the auto-indentation in Kakoune is implemented as user hooks that trigger keystrokes;
here are a few things we are able to express (from the C/C++ indenter):

* on newline, increase indent if previous line ends with { or (
* on newline, align to opening ( if we are in a non closed parenthesis
(respecting nesting)
* on closing brace, auto insert the ';' if we are closing a
class/struct/enum/union

As said in my previous email, the hard part is defining a nice set of operations that gives
this expressiveness without becoming a big mess, in particular when one goal is to stay
interactive (so, one-key commands as much as possible).


Cheers,

Maxime Coste.
ale rimoldi
2014-09-23 18:03:45 UTC
Permalink
hi marc andré

as announced a few days ago, here is my write-up about vis.

you probably won't agree with each point in there...



for many years now, i've been using vim as my main text editor, both for code and for typing text.

while i consider myself a heavy vim user, when i browse through the vim manual, i have to accept the fact that i probably do not use more than 10% of the features.
it's probably rather around 1% or 5%, but i don't see a way to really count it...
and this bothers me: how bloated is a piece of software when a heavy, and curious, user only knows (and cares about) a tiny fraction of the features?

(something maybe important to say: i use vim as distributed, with no plugins, with very few lines in the .vimrc file)

on top of that, while i've never had a look at the vim source code, i've read scary tales about it.

this made me curious about how "good" a suckless version of vim would be for me.

and while testing vis, i started fearing that each of the many heavy users out there uses a different small subset of all the vim goodness: this could explain why it's so bloated. let's hope that it's not true.

i did a vis test drive; i was rather happy with the result, but i also noticed many "everyday features" that are missing... and a few (really not many) bugs.

and i also found some features that i was glad to see not implemented (like U, the linewise undo; ex; K)


here are the results in detail.


first, there are two features that you explicitly do not plan to implement but that are (very) important to me:

- macro recording (really really really... except if you implement multiple selections... but even then... or at least macro playback!)
- visual block mode (very practical for reformatting text; sometimes useful for code, too)

the four "bugs" that most annoyed me while typing this text with vis:

- o does not go into insert mode (how easy would it be to switch my habits to oi?)
- a does not append
- end of line is not the last char in the line but the end of line character
- P does not paste before

and here the only real bug i noticed:

- <count>> indents count+1 lines



and then there is a longer list of commands i really missed (i mean: things that really slow down my workflow, and i'd probably need all -- or at least most -- of them, if i want vis to replace vim...).
some of them have already been mentioned, others not.

:sav filename

expandtab mode with :set ts and :set sw; possibly also :retab
:se si for smart indenting
== automatically indent the current line
automatically insert the comment sign at the beginning of the next line (it's about the #s and *s)
:paste :nopaste modes (if the commands above are implemented)

zt zz zb redraw the current line at the top, center or bottom of the screen

o O should go into insert mode after adding the line
a appends after the current cursor position
. should also repeat insert actions
$ should jump to the last char in the line, not to the hidden return char
in normal mode, when the cursor is at the end of the line it should be on the last char.
x should not delete the end of line character (but this might be solved with the placement issue above)
implement the difference between word and WORD

allow changing buffer without first saving the current buffer
:b# go to the previous buffer
:bd to delete a buffer
:ls list of buffers
:bd <int> delete <int> buffer
:b <int> go to <int> buffer
:vnew :new to create new windows
ctrl-w lhkj to move among windows

in command mode, deleting the : should go back to normal mode
up and down arrows / ctrl-n/p to browse the command mode history

in insert mode ctrl-p and ctrl-n should propose an autocompletion with all known words from the open buffers

:! to run a command
!! to run a command and insert its result as the current line

dd does not work on the last line
pasting a full line should put the cursor at the beginning of the pasted line
P should paste before the cursor
xp should switch the position of the characters at the cursor

gU and gu for uppercase and lowercase
g0 gm g$ move to the beginning/center/end of the screen line

V for line selection
ctrl-v for visual block selection

q to record macros
@ to play macros

gk / gj should keep the last set column (if the cursor moved to a shorter line and then again to a longer one)
<num>gg should go to line <num>
ctrl-o and ctrl-i should jump to the previous cursor position (if possible, only inside the current buffer)

show where the file content ends (vim's ~, but it can be solved in a different way!)

/pattern/e modifier to move the cursor to the end of the match
; and , for the next/previous match of a character find (f/t)
* finds the next occurrence of the word under the cursor


and here are a few nice-to-haves:

- digraphs
- "special" registers (calculation, current filename)
- inserting registers with ctrl-r (ctrl-r% inserts the filename in the file itself)
- optional syntax highlighting
- i'm not sure that having j/k behaving like gj/gk is a good idea
Raphaël Proust
2014-09-24 08:03:09 UTC
Permalink
Post by ale rimoldi
:paste :nopaste modes (if the commands above are implemented)
By the way, if you use st you can have paste/nopaste switched
automatically. See
https://github.com/raphael-proust/rcs/blob/master/home/user/.vimrc#L218
for details.
Post by ale rimoldi
[…]
I agree with some of your “X should do Y” comments. The others I just
don't care about. But that just supports your point that “each of
the many heavy users out there uses a different small subset”, and
the bloatedness that results.
Post by ale rimoldi
- digraphs
Out of curiosity, do you actually use digraphs? Why not compose?
Post by ale rimoldi
- "special" registers (calculation, current filename)
I find acme's way of dealing with it simpler, cleaner, sucklesser
and more unix-like: when calling out to a shell, set the environment
variable $% to the filename. I'd really prefer a vim-like editor not
to try to implement a whole lot of commands that can just be
shelled out, and to use something acme-like instead.

Additionally, having a good shelling-out mechanism (that allows piping
content) also offers a solution to many of the things you mention
(retab, for example).
Post by ale rimoldi
- inserting registers with ctrl-r (ctrl-r% inserts the filename in the file itself)
With the aforementioned environment variable and shell-out trick this
becomes :r!echo $% (admittedly longer, but then you can :r!basename $%
or :r!echo $PWD/$% etc.)
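
The editor side of that trick is tiny, by the way. A rough C sketch (not
vis' actual code; the variable name vis_file is made up here, since as
far as I know a strictly POSIX shell cannot expand a variable literally
named "%", while acme's rc can):

#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

/* export the current filename, then hand the command to the shell */
static int shell_out(const char *filename, const char *cmd)
{
        if (setenv("vis_file", filename, 1) == -1)
                return -1;
        return system(cmd);
}

int main(void)
{
        /* the child command refers to the file via the environment */
        return shell_out("notes.txt", "echo editing \"$vis_file\"");
}

With that in place something like :r!basename "$vis_file" comes for free,
and piping a selection through such a command covers retab-style features
as well.
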
Post by ale rimoldi
- optional syntax highlighting
I still don't know of any editor doing that cleanly… (code-wise)
Post by ale rimoldi
- i'm not sure that having j/k behaving like gj/gk is a good idea
I fully agree with you on this one.



Cheers,
--
______________
Raphaël Proust
Marc Weber
2014-09-24 08:50:35 UTC
Permalink
Post by ale rimoldi
on top of that, while i've never had a look at the vim source code, i've
read scary tales about it.
Some cases where the vim source code appears very hacky to me:
http://vim-wiki.mawercer.de/wiki/topic/in-which-way-does-vim-suck.html

Rewriting an editor? Have a look at existing solutions:
- neovim
- jedit (java editor, no vi keybindings, but you could look at its
history to understand the effort it takes to write an editor)
- ...

IDEA/Eclipse/NetBeans all offer Vi keybindings (maybe via an extra plugin)

Just to get an idea about how much work it is (just have a look at the
history).

If you rewrite, in which language? C++? Sucks - gtk & gimp, suckless, and
more teams (jackaudio) cannot decide which language to use. They end up
with: "Hey, we have two versions now: one written in C having features A,
and one in C++ having B".
That's why I concluded that I had to write a new language first. Where
to start? Have a look at impredicative.com/ur, Disciple (strict Haskell
dialect, memory regions and such), or Rust (Mozilla is writing a
multi-core web engine using it)?

That's why I haven't started a project such as neovim yet.

Also keep in mind that even Emacs has a Vim emulation:
http://www.emacswiki.org/Evil

There is also "yi" written in Haskell which offers Vim and Emacs like
keybinding flavours.

And there is a lot more around (editors written in Ruby on GitHub and
whatnot). What makes Vim still productive, even though it has many
flaws? Its community. An editor without a community is worthless.
Post by ale rimoldi
and while testing vis, i started fearing that each of the many heavy
users out there uses a different small subset of all the vim goodness:
this could explain why it's so bloated. let's hope that it's not
true.
It totally is. It's vim, it's the mailing list, it's the "plugin X which
provides feature Y" - I tried switching to Emacs once, but gave up after
2 weeks. Too much code to rewrite.

Thus, before rewriting from scratch, find out what you really want and
where to start - and make sure you have either funding or enough "money"
to live on first .. ^^

That's my view.
Marc Weber
Martti Kühne
2014-09-24 11:10:49 UTC
Permalink
Post by Marc Weber
Rewriting an editor? Have a look at existing solutions:
You forgot to mention the historic example of a group of people who
aimed at doing everything DIY and got stuck with an over-bloated text
editor, which some people jokingly call "a great OS btw, lacking only a decent
text editor". You know, the one that comes with its own LISP
interpreter...

cheers!
mar77i
