Why not be dependently typed?
I have seen several sources echo the opinion that "Haskell is gradually becoming a dependently typed language". The implication seems to be that, with more and more language extensions, Haskell is drifting in that general direction but is not there yet.
There are basically two things I would like to know. The first is: what does "being a dependently typed language" actually mean? (Hopefully without it getting too technical.)
The second question is... what's the drawback? People evidently know we are heading that way, so there must be some advantage to it. And yet we are not there yet, so there must be some downside stopping people from going all the way. I get the impression that the problem is a steep increase in complexity. But, not really understanding what dependent typing is, I can't be sure.
What I do know is that every time I start reading a book about a dependently typed programming language, the text turns out to be... completely incomprehensible, which presumably has something to do with the problem. (?)
Dependent typing is really just the unification of the value and type levels, so you can parameterize values on types (already possible in Haskell with type classes and parametric polymorphism) and you can parameterize types on values (not, strictly speaking, possible in Haskell yet, although DataKinds gets very close).
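To make the two directions concrete, here is a minimal sketch of each (my own illustration, not part of the original answer; the names Nat and Proxy are assumed), using GHC's DataKinds and KindSignatures extensions:

{-# LANGUAGE DataKinds, KindSignatures #-}

-- Values parameterized on types: plain parametric polymorphism.
identity :: a -> a
identity x = x

-- Types parameterized on values: DataKinds promotes the constructors
-- Z and S to the type level, so a type can be indexed by them.
data Nat = Z | S Nat
data Proxy (n :: Nat) = Proxy

two :: Proxy (S (S Z))
two = Proxy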
Edit: apparently, from this point forward, I was wrong (see @pigworker's comments). I'll preserve the rest of this as a record of the myths I've been fed. :P
From what I've heard, the problem with moving to full dependent typing is that it would break the phase restriction between the type and value levels which allows Haskell to be compiled to efficient machine code with erased types. With our current level of technology, a dependently typed language must go through an interpreter at some point (that is, either immediately, or after being compiled to dependently typed bytecode or similar).
This is not necessarily a fundamental restriction, but I'm not personally aware of any current research that looks promising in this regard and has not already made it into GHC. I would be happy to be corrected if anyone else knows more.
Dependently Typed Haskell, Now?
Haskell is, to a small extent, a dependently typed language. There is a notion of type-level data, now more sensibly typed thanks to DataKinds, and there is some means (GADTs) to give a run-time representation to type-level data. Hence, values of run-time stuff effectively show up in types, which is what it means for a language to be dependently typed.
Simple datatypes are promoted to the kind level, so that the values they contain can be used in types. Hence the archetypal example
data Nat = Z | S Nat
data Vec :: Nat -> * -> * where
  VNil :: Vec Z x
  VCons :: x -> Vec n x -> Vec (S n) x
becomes possible, and with it, definitions such as
vApply :: Vec n (s -> t) -> Vec n s -> Vec n t
vApply VNil VNil = VNil
vApply (VCons f fs) (VCons s ss) = VCons (f s) (vApply fs ss)
which is nice. Note that the length n is a purely static thing in that function, ensuring that the input and output vectors have the same length, even though the length plays no role in the execution of vApply. By contrast, it's much trickier (i.e., impossible) to implement the function which makes n copies of a given x (which would be pure to vApply's <*>)
vReplicate :: x -> Vec n x
because it's vital to know how many copies to make at run-time. Enter singletons.
data Natty :: Nat -> * where
  Zy :: Natty Z
  Sy :: Natty n -> Natty (S n)
For any promotable type, we can build the singleton family, indexed over the promoted type, inhabited by run-time duplicates of its values. Natty n is the type of run-time copies of the type-level n :: Nat. We can now write
vReplicate :: Natty n -> x -> Vec n x
vReplicate Zy x = VNil
vReplicate (Sy n) x = VCons x (vReplicate n x)
So we have a type-level value yoked to the run-time level: inspecting the run-time copy refines static knowledge of the type-level value. Even though terms and types are separated, we can work in a dependently typed way by using the singleton construction as a kind of epoxy resin, creating bonds between the phases. That's a long way from allowing arbitrary run-time expressions in types, but it's not nothing.
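To see the singleton at work, here is a small usage sketch (mine, not from the original answer): the Natty argument is the run-time copy of the type-level length, and it fully determines the result type.

-- Sy (Sy Zy) is the run-time copy of the type-level S (S Z),
-- so the length of the result is forced by the argument.
twoZeroes :: Vec (S (S Z)) Int
twoZeroes = vReplicate (Sy (Sy Zy)) 0
-- = VCons 0 (VCons 0 VNil)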
What's Nasty? What's Missing?
Let's put this technique under a bit of pressure and see what starts to wobble. You might get the idea that singletons should be manageable a bit more implicitly
class Nattily (n :: Nat) where
  natty :: Natty n
instance Nattily Z where
  natty = Zy
instance Nattily n => Nattily (S n) where
  natty = Sy natty
allowing us to write
instance Nattily n => Applicative (Vec n) where
  pure = vReplicate natty
  (<*>) = vApply
That works, but it now means that our original Nat type has spawned three copies: a kind, a singleton family and a singleton class. We have a rather clunky process for exchanging explicit Natty n values and Nattily n dictionaries. Moreover, Natty is not Nat: we have some sort of dependency on run-time values, but not at the type we first thought of. No fully dependently typed language makes dependent types this complicated!
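One direction of that clunky exchange can at least be packaged up once and for all. Here is a sketch (the name natter is mine, and it needs RankNTypes) of rebuilding a Nattily dictionary from an explicit Natty value:

-- Recurse over the singleton, discharging the Nattily constraint
-- layer by layer via the two instances above.
natter :: Natty n -> (Nattily n => t) -> t
natter Zy     t = t
natter (Sy n) t = natter n t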
Meanwhile, although Nat can be promoted, Vec cannot. You can't index by an indexed type. Full on dependently typed languages impose no such restriction, and in my career as a dependently typed show-off, I've learned to include examples of two-layer indexing in my talks, just to teach folks who've made one-layer indexing difficult-but-possible not to expect me to fold up like a house of cards. What's the problem? Equality. GADTs work by translating the constraints you achieve implicitly when you give a constructor a specific return type into explicit equational demands. Like this:
data Vec (n :: Nat) (x :: *)
  = n ~ Z => VNil
  | forall m. n ~ S m => VCons x (Vec m x)
In each of our two equations, both sides have kind Nat. Now try the same translation for something indexed over vectors.
data InVec :: x -> Vec n x -> * where
  Here :: InVec z (VCons z zs)
  After :: InVec z ys -> InVec z (VCons y ys)
becomes
data InVec (a :: x) (as :: Vec n x)
  = forall m z (zs :: Vec m x). (n ~ S m, as ~ VCons z zs) => Here
  | forall m y z (ys :: Vec m x). (n ~ S m, as ~ VCons y ys) => After (InVec z ys)
and now we form equational constraints between as :: Vec n x and VCons z zs :: Vec (S m) x, where the two sides have syntactically distinct (but provably equal) kinds. GHC core is not currently equipped for such a concept!
What else is missing? Well, most of Haskell is missing from the type level. The language of terms which you can promote has just variables and non-GADT constructors, really. Once you have those, the type family machinery allows you to write type-level programs: some of those might be quite like functions you would consider writing at the term level (e.g., equipping Nat with addition, so you can give a good type to append for Vec), but that's just a coincidence!
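For instance, here is a sketch of that addition-and-append coincidence (the names :+ and vAppend are my own; this assumes the TypeFamilies and TypeOperators extensions on top of the code above):

-- Type-level addition on the promoted Nat, as a type family.
type family (m :: Nat) :+ (n :: Nat) :: Nat
type instance Z   :+ n = n
type instance S m :+ n = S (m :+ n)

-- The family lets append's type track the lengths of the vectors.
vAppend :: Vec m x -> Vec n x -> Vec (m :+ n) x
vAppend VNil         ys = ys
vAppend (VCons x xs) ys = VCons x (vAppend xs ys)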
Another thing missing, in practice, is a library which makes use of our new abilities to index types by values. What do Functor and Monad become in this brave new world? I'm thinking about it, but there's a lot still to do.
Running Type-Level Programs
Haskell, like most dependently typed programming languages, has two operational semanticses. There's the way the run-time system runs programs (closed expressions only, after type erasure, highly optimised) and then there's the way the typechecker runs programs (your type families, your "type class Prolog", with open expressions). For Haskell, you don't normally mix the two up, because the programs being executed are in different languages. Dependently typed languages have separate run-time and static execution models for the same language of programs, but don't worry, the run-time model still lets you do type erasure and, indeed, proof erasure: that's what Coq's extraction mechanism gives you; that's at least what Edwin Brady's compiler does (although Edwin erases unnecessarily duplicated values, as well as types and proofs). The phase distinction may not be a distinction of syntactic category any longer, but it's alive and well.
Dependently typed languages, being total, allow the typechecker to run programs free from the fear of anything worse than a long wait. As Haskell becomes more dependently typed, we face the question of what its static execution model should be. One approach might be to restrict static execution to total functions, which would allow us the same freedom to run, but might force us to make distinctions (at least for type-level code) between data and codata, so that we can tell whether to enforce termination or productivity. But that's not the only approach. We are free to choose a much weaker execution model which is reluctant to run programs, at the cost of making fewer equations come out just by computation. And in effect, that's what GHC actually does. The typing rules for GHC core make no mention of running programs, but only of checking evidence for equations. When translating to core, GHC's constraint solver tries to run your type-level programs, generating a little silvery trail of evidence that a given expression equals its normal form. This evidence-generation method is a little unpredictable and inevitably incomplete: it fights shy of scary-looking recursion, for example, and that's probably wise. One thing we don't need to worry about is the execution of IO computations in the typechecker: remember that the typechecker doesn't have to give launchMissiles the same meaning that the run-time system does!
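To make that concrete with the :+ family sketched earlier (again, my own example): accepting the definition below requires GHC's solver to run :+ at compile time, producing evidence that S (S Z) :+ S Z equals S (S (S Z)).

-- Typechecking this forces the solver to reduce S (S Z) :+ S Z.
vThree :: Vec (S (S (S Z))) Char
vThree = vAppend (VCons 'a' (VCons 'b' VNil)) (VCons 'c' VNil)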
Hindley-Milner Culture
The Hindley-Milner type system achieves the truly awesome coincidence of four distinct distinctions, with the unfortunate cultural side-effect that many people cannot see the distinction between the distinctions and assume the coincidence is inevitable! What am I talking about?
- terms vs types
- explicitly written things vs implicitly written things
- presence at run-time vs erasure before run-time
- non-dependent abstraction vs dependent quantification
We're used to writing terms and leaving types to be inferred...and then erased. We're used to quantifying over type variables with the corresponding type abstraction and application happening silently and statically.
You don't have to veer too far from vanilla Hindley-Milner before these distinctions come out of alignment, and that's no bad thing. For a start, we can have more interesting types if we're willing to write them in a few places. Meanwhile, we don't have to write type class dictionaries when we use overloaded functions, but those dictionaries are certainly present (or inlined) at run-time. In dependently typed languages, we expect to erase more than just types at run-time, but (as with type classes) some implicitly inferred values will not be erased. E.g., vReplicate's numeric argument is often inferable from the type of the desired vector, but we still need to know it at run-time.
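A sketch of that situation with the Nattily class from earlier (the name vCopies is mine): the singleton argument is inferred from the result type, yet it still travels in the dictionary at run-time.

-- No explicit Natty argument: it is inferred from the result type,
-- but the Nattily dictionary still delivers it at run-time.
vCopies :: Nattily n => x -> Vec n x
vCopies = vReplicate natty

twoCopies :: Vec (S (S Z)) Char
twoCopies = vCopies 'x'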
Which language design choices should we review because these coincidences no longer hold? E.g., is it right that Haskell provides no way to instantiate a forall x. t quantifier explicitly? If the typechecker can't guess x by unifying t, we have no other way to say what x must be.
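A tiny illustration of the problem (my own): read has type Read a => String -> a, and when nothing pins down a, the only recourse is a type annotation, because there is no syntax for instantiating the quantifier directly.

-- Rejected: nothing determines read's result type.
-- roundTrip s = show (read s)

-- The workaround is an annotation, not an explicit instantiation.
roundTrip :: String -> String
roundTrip s = show (read s :: Int)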
More broadly, we cannot treat "type inference" as a monolithic concept that we have either all or nothing of. For a start, we need to split off the "generalisation" aspect (Milner's "let" rule), which relies heavily on restricting which types exist to ensure that a stupid machine can guess one, from the "specialisation" aspect (Milner's "var" rule) which is as effective as your constraint solver. We can expect that top-level types will become harder to infer, but that internal type information will remain fairly easy to propagate.
Next Steps For Haskell
We're seeing the type and kind levels grow very similar (and they already share an internal representation in GHC). We might as well merge them. It would be fun to take * :: * if we can: we lost logical soundness long ago, when we allowed bottom, but type soundness is usually a weaker requirement. We must check. If we must have distinct type, kind, etc. levels, we can at least make sure everything at the type level and above can always be promoted. It would be great just to re-use the polymorphism we already have for types, rather than re-inventing polymorphism at the kind level.
We should simplify and generalise the current system of constraints by allowing heterogeneous equations a ~ b where the kinds of a and b are not syntactically identical (but can be proven equal). It's an old technique (in my thesis, last century) which makes dependency much easier to cope with. We'd be able to express constraints on expressions in GADTs, and thus relax restrictions on what can be promoted.
We should eliminate the need for the singleton construction by introducing a dependent function type, pi x :: s -> t. A function with such a type could be applied explicitly to any expression of type s which lives in the intersection of the type and term languages (so, variables, constructors, with more to come later). The corresponding lambda and application would not be erased at run-time, so we'd be able to write
vReplicate :: pi n :: Nat -> x -> Vec n x
vReplicate Z x = VNil
vReplicate (S n) x = VCons x (vReplicate n x)
without replacing Nat by Natty. The domain of pi can be any promotable type, so if GADTs can be promoted, we can write dependent quantifier sequences (or "telescopes", as de Bruijn called them)
pi n :: Nat -> pi xs :: Vec n x -> ...
to whatever length we need.
The point of these steps is to eliminate complexity by working directly with more general tools, instead of making do with weak tools and clunky encodings. The current partial buy-in makes the benefits of Haskell's sort-of dependent types more expensive than they need to be.
Too Hard?
Dependent types make a lot of people nervous. They make me nervous, but I like being nervous, or at least I find it hard not to be nervous anyway. But it doesn't help that there's quite such a fog of ignorance around the topic. Some of that's due to the fact that we all still have a lot to learn. But proponents of less radical approaches have been known to stoke fear of dependent types without always making sure the facts are wholly with them. I won't name names. The "undecidable typechecking", "Turing incomplete", "no phase distinction", "no type erasure", "proofs everywhere", etc. myths persist, even though they're rubbish.
It's certainly not the case that dependently typed programs must always be proven correct. One can improve the basic hygiene of one's programs, enforcing additional invariants in types without going all the way to a full specification. Small steps in this direction quite often result in much stronger guarantees with few or no additional proof obligations. It is not true that dependently typed programs are inevitably full of proofs, indeed I usually take the presence of any proofs in my code as the cue to question my definitions.
For, as with any increase in articulacy, we become free to say foul new things as well as fair. E.g., there are plenty of crummy ways to define binary search trees, but that doesn't mean there isn't a good way. It's important not to presume that bad experiences cannot be bettered, even if it dents the ego to admit it. Design of dependent definitions is a new skill which takes learning, and being a Haskell programmer does not automatically make you an expert! And even if some programs are foul, why would you deny others the freedom to be fair?
Why Still Bother With Haskell?
I really enjoy dependent types, but most of my hacking projects are still in Haskell. Why? Haskell has type classes. Haskell has useful libraries. Haskell has a workable (although far from ideal) treatment of programming with effects. Haskell has an industrial strength compiler. The dependently typed languages are at a much earlier stage in growing community and infrastructure, but we'll get there, with a real generational shift in what's possible, e.g., by way of metaprogramming and datatype generics. But you just have to look around at what people are doing as a result of Haskell's steps towards dependent types to see that there's a lot of benefit to be gained by pushing the present generation of languages forwards, too.
John, that's another common misconception about dependent types: that they don't work when data is only available at run-time. Here's how you can do the getLine example:
data Some :: (k -> *) -> * where
  Like :: p x -> Some p

fromInt :: Int -> Some Natty
fromInt 0 = Like Zy
fromInt n = case fromInt (n - 1) of
  Like n -> Like (Sy n)

withZeroes :: (forall n. Vec n Int -> IO a) -> IO a
withZeroes k = do
  Like n <- fmap (fromInt . read) getLine
  k (vReplicate n 0)
*Main> withZeroes print
5
VCons 0 (VCons 0 (VCons 0 (VCons 0 (VCons 0 VNil))))
Edit: Hm, that was supposed to be a comment to pigworker's answer. I clearly fail at SO.
pigworker gives an excellent discussion of why we should be headed towards dependent types: (a) they're awesome; (b) they would actually simplify a lot of what Haskell already does.
As for the "why not?" question, there are a couple points I think. The first point is that while the basic notion behind dependent types is easy (allow types to depend on values), the ramifications of that basic notion are both subtle and profound. For example, the distinction between values and types is still alive and well; but discussing the difference between them becomes far more nuanced than in yer Hindley--Milner or System F. To some extent this is due to the fact that dependent types are fundamentally hard (e.g., first-order logic is undecidable). But I think the bigger problem is really that we lack a good vocabulary for capturing and explaining what's going on. As more and more people learn about dependent types, we'll develop a better vocabulary and so things will become easier to understand, even if the underlying problems are still hard.
The second point has to do with the fact that Haskell is growing towards dependent types. Because we're making incremental progress towards that goal, but without actually making it there, we're stuck with a language that has incremental patches on top of incremental patches. The same sort of thing has happened in other languages as new ideas became popular. Java didn't use to have (parametric) polymorphism; and when they finally added it, it was obviously an incremental improvement with some abstraction leaks and crippled power. Turns out, mixing subtyping and polymorphism is inherently hard; but that's not the reason why Java Generics work the way they do. They work the way they do because of the constraint to be an incremental improvement to older versions of Java. Ditto for further back in the day when OOP was invented and people started writing "objective" C (not to be confused with Objective-C), etc. Remember, C++ started out under the guise of being a strict superset of C. Adding new paradigms always requires defining the language anew, or else ending up with some complicated mess. My point in all of this is that adding true dependent types to Haskell is going to require a certain amount of gutting and restructuring of the language, if we're going to do it right. But it's really hard to commit to that kind of overhaul, whereas the incremental progress we've been making seems cheaper in the short term. Really, there aren't that many people who hack on GHC, but there's a goodly amount of legacy code to keep alive. This is part of the reason why there are so many spinoff languages like DDC, Cayenne, Idris, etc.
Source: https://stackoverflow.com/questions/12961651/why-not-be-dependently-typed